(54) Title: NUCLEIC ACID ANALYSIS TECHNIQUES

(57) Abstract

The present invention provides a simplified method for identifying differences in nucleic acid abundances (e.g., expression levels) between two or more samples. The methods involve providing an array containing a large number (e.g., greater than 1,000) of arbitrarily selected different oligonucleotide probes where the sequence and location of each different probe is known. Nucleic acid samples (e.g., mRNA) from two or more samples are hybridized to the probe arrays and the pattern of hybridization is detected. Differences in the hybridization patterns between the samples indicate differences in expression of various genes between those samples. This invention also provides a method of end-labeling a nucleic acid. In one embodiment, the method involves providing a nucleic acid, providing a labeled oligonucleotide and then enzymatically ligating the oligonucleotide to the nucleic acid. Thus, for example, where the nucleic acid is an RNA, a labeled oligoribonucleotide can be ligated using an RNA ligase. In another embodiment, end labeling can be accomplished by providing a nucleic acid, providing labeled nucleoside triphosphates, and attaching the nucleoside triphosphates to the nucleic acid using a terminal transferase.
WO 97/27317






The principle behind differential display is to generate a set of randomly primed amplification (e.g., PCR) fragments from a first strand cDNA population transcribed from RNA using anchor primers of the form:

(T)nVA, (T)nVG, (T)nVC, and (T)nVT

in which V is A, G, or C, and n ranges from about 6 to about 30, preferably from about 8 to about 20 and more preferably about 10 to about 16, with n=14 being most preferred. Depending on what random primer and anchoring primer is chosen, different sets of cDNA transcripts are represented in a particular nucleic acid fragment set. These amplification fragments are analyzed by sorting the fragments on a generic screening oligonucleotide array where they hybridize based on the sequence at the 5' end of the fragment.

The method is illustrated in Figures 16a through 16e. First strand cDNA is synthesized by reverse transcription of poly(A) mRNA using an anchored poly(T) primer according to standard methods (Fig. 16a). The first strand DNA acts as a template for amplification (e.g., via PCR) using upstream primers comprising an engineered restriction site and one or more degenerate bases (N=A,C,G,T) at the 3' end. Randomly primed PCR is then performed using the upstream primers, the anchor primers and a random primer (e.g., anchor primers (T)nVA, (T)nVG, (T)nVC, (T)nVT and a random primer with, e.g., a Sac I site: 5'-CATGAGCTCNN). The resulting amplification fragments are then digested with a restriction endonuclease corresponding to the engineered restriction sites. The resulting sample nucleic acids are then hybridized to a generic difference screening array as described above.

The method is preferably performed on two or more nucleic acid samples, thereby allowing use of the generic difference screening methods of this invention. In one embodiment, the probe oligonucleotides comprise a constant region complementary to the remaining restriction site on the sample nucleic acids, if present. The remaining analysis proceeds as described above.

The method allows analysis of several thousand or even more "bands" (nucleic acids) simultaneously. Furthermore, sequence information is also provided on the differentially abundant nucleic acid. For example, where the cleavage is with Sac I, providing a 9 base tail (CATGAGCTC), the array can comprise probe oligonucleotides having a complementary 9 base constant region and variable regions comprising all possible 8 mers. This provides 17 nucleotides of sequence information for each hybridization (9 mer constant + 8 mer variable).
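The primer and probe layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, n = 14 is simply the most preferred value quoted in the text, the variable region is assumed to be 8 bases long (consistent with the stated 9 + 8 = 17 nucleotides), and placing the variable region after the constant region is an assumption about probe orientation.

```python
from itertools import product

BASES = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def anchor_primers(n=14):
    """Expand the anchor primers (T)nVA, (T)nVG, (T)nVC and (T)nVT, with V = A, G or C."""
    return ["T" * n + v + last for last in "AGCT" for v in "AGC"]


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def array_probes(tail="CATGAGCTC", variable_len=8):
    """Yield probes made of a constant region complementary to the tail
    followed by every possible variable region of the given length.

    The constant-then-variable orientation is an assumption for illustration.
    """
    constant = reverse_complement(tail)
    for variable in product(BASES, repeat=variable_len):
        yield constant + "".join(variable)
```

With the defaults, `anchor_primers()` yields 12 concrete primer sequences and `array_probes()` yields 4^8 probes of 17 bases each.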



iv) Use of ligation to extract additional sequence information from restriction selected nucleic acid hybridizations.

Ligation reactions can also be used in combination with restriction digests to subsample the sample nucleic acids at approximately uniform intervals and simultaneously provide additional sequence information using a ligation reaction. In this embodiment, a high density array is provided in which the probe oligonucleotides comprise a nucleic acid sequence complementary to the sense or antisense strand of a restriction site (see, e.g., Fig. 14). The sample nucleic acids are digested randomly with a DNAse or specifically with a restriction endonuclease (e.g., Sau 3A). The digested oligonucleotides are then hybridized to the high density array. Only nucleic acids having termini complementary to the constant regions will bind to the probe oligonucleotides. Thus, the restriction fragments will be preferentially selected.

The array is also hybridized with a pool of ligatable oligonucleotides comprising all possible oligonucleotides of a particular length (e.g., a 6 mer) in the presence of a ligase, thereby ligating the complementary ligatable oligonucleotides to the terminus of the probe oligonucleotide. This produces probe oligonucleotides increased in length by the length of the ligatable oligonucleotide and complementary to nucleic acids known to be present in the nucleic acid sample.

The DNA is then stripped off of the array and the elongated probes are used to perform generic difference screening of the nucleic acid samples as described above. When probes corresponding to nucleic acids differentially expressed in the various samples are identified, the known probe sequence can be used to identify the nucleic acids that are differentially expressed in the samples.

In one embodiment, this is accomplished by producing 4 primer oligonucleotides comprising the constant region plus the known variable region and an additional nucleotide (A, G, C, or T) on one end. The genomic clone is then digested with a second restriction enzyme and ligated to an adaptor sequence. Using the 4 primer oligonucleotides and the adapter sequence as primers, the sequence of interest can be amplified (e.g., using PCR) from the genomic clones. The amplified sequence can then be used to probe (e.g., in a Southern blot) the cDNA of interest.

For example, in one embodiment, a 10 mer high density array is designed so that it comprises all possible combinations of 10 mer oligonucleotides (i.e., 4^10 = 1048576 nucleic acids) and, at the beginning of each oligonucleotide, a constant sequence (e.g., 3'-TAGT-5', the first 4 bases of which are complementary to the recognition sequence of a restriction enzyme (e.g., Sau 3A plus one base T)).

Complete digestion of a large genomic clone or a simplified cDNA library (e.g., a cDNA library that only includes parts of the 5' end or 3' end of whole mRNA) with, for example, a 4 cutter enzyme (illustrated herein by Sau 3A) generates DNA fragments with a 5' overhang sequence (for Sau 3A, the overhang is GATC). The recognition site exists at approximately every 500 bp.

When the DNA fragments are hybridized with the 10 mer chip in the presence of all possible combinations of a ligatable oligonucleotide of a particular length (e.g., a 6 mer) and a T4 DNA ligase, the ligatable oligonucleotide is ligated onto the probe oligonucleotide.

The DNA is then stripped off the chips and generic difference screening is performed as described above. This permits identification of probe oligonucleotides that hybridize to nucleic acids that are present at different levels in the tested samples.

Based on the 14 bp sequence (4 mer constant region bases plus 10 mers) from the probes of interest in the array, 16 base primers are produced by adding one base (A, G, C or T) at the end. Using these primers and adaptor sequences as primers, the genomic sequence of interest can be amplified. The PCR amplified sequence can then be used to probe a cDNA library to obtain the whole cDNA of interest as described above.
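The two counting steps in this passage (the number of probes on an array of all possible k-mers, and the four primers obtained by adding one base to a known probe sequence) can be illustrated with a short Python sketch. The function names and the example sequence are hypothetical and are not taken from the source; appending the base at the end of the string is an assumption about primer orientation.

```python
def array_size(k=10):
    """Number of distinct k-mer probes when every base combination is present."""
    return 4 ** k


def extension_primers(probe_sequence):
    """Return the four primers formed by adding one base (A, G, C or T)
    to the end of a known probe sequence."""
    return [probe_sequence + base for base in "AGCT"]


# A 10 mer array holds 4^10 probes, as stated in the text.
n_probes = array_size(10)

# Hypothetical probe sequence, used only to show the four extensions.
primers = extension_primers("GATCAACCGGTTAC")
```

Only one of the four extended primers is expected to match the target at the added position, which is why all four are produced.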
IX. Signal Evaluation.

A) Signal Evaluation for expression monitoring.

One of skill in the art will appreciate that methods for evaluating the hybridization results vary with the nature of the specific probe nucleic acids used as well as the controls provided. In the simplest embodiment, simple quantification of the fluorescence intensity for each probe is determined. This is accomplished simply by measuring probe signal strength at each location (representing a different probe) on the high density array (e.g., where the label is a fluorescent label, detection of the amount of fluorescence (intensity) produced by a fixed excitation illumination at each location on the array). Comparison of the absolute intensities of an array hybridized to nucleic acids from a "test" sample with intensities produced by a "control" sample provides a measure of the relative abundance of the nucleic acids that hybridize to each of the probes.

One of skill in the art, however, will appreciate that hybridization signals will vary in strength with efficiency of hybridization, the amount of label on the sample nucleic acid and the amount of the particular nucleic acid in the sample. Typically nucleic acids present at very low levels (e.g., < …) will show a very weak signal. At some low level of concentration, the signal becomes virtually indistinguishable from background. In evaluating the hybridization data, a threshold intensity value may be selected below which a signal is not counted as being essentially indistinguishable from background.

Where it is desirable to detect nucleic acids expressed at lower levels, a lower threshold is chosen. Conversely, where only high expression levels are to be evaluated, a higher threshold level is selected. In a preferred embodiment, a suitable threshold is about 10% above that of the average background signal.

In addition, the provision of appropriate controls permits a more detailed analysis that controls for variations in hybridization conditions, cell health, non-specific binding and the like. Thus, for example, in a preferred embodiment, the hybridization array is provided with normalization controls as described above in Section IV(A)(2). These normalization controls are probes complementary to control sequences added in a known concentration to the sample. Where the overall hybridization conditions are poor, the normalization controls will show a smaller signal reflecting reduced hybridization. Conversely, where hybridization conditions are good, the normalization controls will provide a higher signal reflecting the improved hybridization. Normalization of the signal derived from other probes in the array to the normalization controls thus provides a control for variations in array synthesis or in hybridization conditions. Typically, normalization is accomplished by dividing the measured signal from the other probes in the array by the average signal produced by the normalization controls. Normalization may also include correction for variations due to sample preparation and amplification. Such normalization may be accomplished by dividing the measured signal by the average signal from the sample preparation/amplification control probes (e.g., the … probes). The resulting values may be multiplied by a constant value to scale the results.

As indicated above, the high density array can include mismatch controls or, in the case of generic difference screening arrays, pairs of related oligonucleotide probes differing in one or more preselected nucleotides. In preferred expression monitoring arrays, there is a mismatch control having a central mismatch for every probe (except the normalization controls) in the array. It is expected that after washing in stringent conditions, where a perfect match would be expected to hybridize to the probe, but not to the mismatch, the signal from the mismatch controls should primarily reflect non-specific binding or the presence in the sample of a nucleic acid that hybridizes with the mismatch. In expression monitoring analyses, where both the probe in question and its corresponding mismatch control both show high signals, or the mismatch shows a higher signal than its corresponding test probe, the signal from those probes is preferably ignored. The difference in hybridization signal intensity between the target specific probe and its corresponding mismatch control is a measure of the discrimination of the target-specific probe. Thus, in a preferred embodiment, the signal of the mismatch probe is subtracted from the signal from its corresponding test probe to provide a measure of the signal due to specific binding of the test probe. Similarly, as discussed below, in generic difference screening, the difference between probe pairs is calculated.

The concentration of a particular sequence can then be determined by measuring the signal intensity of each of the probes that bind specifically to that nucleic acid and normalizing to the normalization controls. Where the signal from the probes is greater than the mismatch, the mismatch is subtracted. Where the mismatch intensity is equal to or greater than its corresponding test probe, the signal is ignored. The expression level of a particular gene can then be scored by the number of positive signals (either absolute or above a threshold value), the intensity of the positive signals (either absolute or above a selected threshold value), or a combination of both metrics (e.g., a weighted average).

It is a surprising discovery of this invention that normalization controls are often unnecessary for useful quantification of a hybridization signal. Thus, where optimal probes have been identified in the two step selection process as described above, in Section IV(B)(ii)(a), the average hybridization signal produced by the selected optimal probes provides a good quantified measure of the concentration of hybridized nucleic acid.
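The evaluation steps described in this section (background threshold, mismatch subtraction, normalization to control probes, and scoring) can be sketched as follows. This is a minimal illustration rather than the method of the source: the function names are hypothetical, the 10% margin is the "about 10% above average background" figure from the text, and applying the threshold to the raw perfect-match intensity (instead of to the difference) is an assumption.

```python
from statistics import mean


def specific_signals(pairs, normalization_controls, background, margin=0.10):
    """Return normalized specific signals for one target.

    pairs: list of (perfect_match, mismatch) intensities.
    normalization_controls: intensities of the normalization control probes.
    background: average background intensity.
    """
    norm = mean(normalization_controls)
    threshold = background * (1.0 + margin)  # about 10% above background
    kept = []
    for pm, mm in pairs:
        if pm < threshold:
            continue  # indistinguishable from background (assumed test on PM)
        if mm >= pm:
            continue  # mismatch equal to or greater than test probe: ignore
        kept.append((pm - mm) / norm)  # subtract mismatch, then normalize
    return kept


def expression_score(signals):
    """Score a gene by the number of positive signals and their mean intensity."""
    if not signals:
        return 0, 0.0
    return len(signals), mean(signals)
```

The two values returned by `expression_score` correspond to the two metrics named in the text; how they would be weighted into a single score is left open here.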



B) Signal evaluation for generic difference screening.

Signal evaluation for generic difference screening is performed in essentially the same manner as expression monitoring described above. However, data is evaluated on a probe-by-probe basis rather than a gene by gene basis.

In a preferred embodiment, for each probe oligonucleotide the signal intensity difference between the members of each probe pair (K) is calculated as:

\[ X_{ijK} = X_{ijK1} - X_{ijK2} \]

where X is the hybridization intensity of the probe, i indicates which sample (in this case sample 1 or 2), j indicates which replicate for each sample (in the case of Example 7, where there were two replicates for each nucleic acid sample, j is 1 or 2), K is the probe pair ID number (in the case of Example 7, 1 . . . 34320), and 1 indicates one member of the probe pair, while 2 indicates the other member of the probe pair.

The differences between the signal intensity difference for each probe pair between the replicates for each sample is then calculated. Thus, for example, the differences between replicate 1 and 2 of sample 1 (e.g., the normal cell line) and between replicate 1 and replicate 2 of sample 2 (e.g., a tumor cell line) for each probe is calculated as

\[ (X_{11K1} - X_{11K2}) - (X_{12K1} - X_{12K2}) \quad \text{and} \quad (X_{21K1} - X_{21K2}) - (X_{22K1} - X_{22K2}) \]

for K = 1 to the total number of probes.

The replicates can be normalized to each other, for sample 1 or for sample 2, for all probe pairs (i.e., after normalization, the average ratio should approximate 1).

Finally, the differences between sample 1 and 2 averaged over the two replicates is calculated. This value is calculated as

\[ \frac{X_{21K} + X_{22K}}{2} - \frac{X_{11K} + X_{12K}}{2} \]

after normalization between the two samples based on the average ratio of $[(X_{21K} + X_{22K})/2] / [(X_{11K} + X_{12K})/2]$. This data is plotted as a function of probe number and probes having differentially hybridized nucleic acids are readily discernible (see Fig. 16c).

However, the data may also be filtered to reduce background signal. In this instance, after normalization between replicates (see above), the ratio is calculated as follows: If the absolute value of $(X_{11K1} - X_{11K2})/(X_{12K1} - X_{12K2}) \geq 1$, then the ratio $= (X_{11K1} - X_{11K2})/(X_{12K1} - X_{12K2})$, else the ratio $= (X_{12K1} - X_{12K2})/(X_{11K1} - X_{11K2})$ (the inverse).

The ratio of replicate 1 and 2 of sample 2 for the difference of each oligonucleotide pair is calculated in the same way.

Finally, as above, the ratio of sample 1 and sample 2 averaged over two replicates for the difference of each oligonucleotide pair is calculated as in Fig. 17a, but based on the absolute value of $[(X_{11K1} - X_{11K2}) + (X_{12K1} - X_{12K2})]/2$ and $[(X_{21K1} - X_{21K2}) + (X_{22K1} - X_{22K2})]/2$ after normalization as described above.

The oligonucleotide pairs that show the greatest differential hybridization between the two samples can be identified by sorting the observed hybridization ratio and difference values. The oligonucleotides that show the largest change (increase or decrease) can be readily seen from the ratio plot (see, e.g., Fig. 17).
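The probe-pair arithmetic for generic difference screening can be illustrated with a small Python sketch. It is offered under assumptions, not as the source's implementation: the function names are hypothetical, a ratio of exactly 1 is kept rather than inverted, and a zero difference (which would make the ratio undefined) is reported as `None`, a case the text does not address.

```python
def pair_difference(x_member1, x_member2):
    """Signal intensity difference between the two members of a probe pair."""
    return x_member1 - x_member2


def filtered_ratio(diff_rep1, diff_rep2):
    """Ratio of probe-pair differences between two replicates.

    The ratio is used as-is when its absolute value is at least 1;
    otherwise the inverse is used. Returns None if either difference is zero.
    """
    if diff_rep1 == 0 or diff_rep2 == 0:
        return None
    ratio = diff_rep1 / diff_rep2
    return ratio if abs(ratio) >= 1 else diff_rep2 / diff_rep1


def sample_difference(sample1_diffs, sample2_diffs):
    """Difference between the replicate-averaged pair differences of two samples."""
    avg1 = sum(sample1_diffs) / len(sample1_diffs)
    avg2 = sum(sample2_diffs) / len(sample2_diffs)
    return avg2 - avg1
```

Sorting probe pairs by the absolute value of `sample_difference` or `filtered_ratio` would then surface the pairs with the largest change, in the spirit of the ranking step described above.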



X. Identification of Genes Whose Expression Is Altered.

As indicated above, the nucleic acid sequences of the probe oligonucleotides comprising the high density arrays are known. The sequences of the probes showing the largest hybridization differences (and families of such differences) can be used to identify the differentially expressed genes in the compared samples by any of a number of means.

Thus, for example, sequences of the differentially hybridizing probes may be used to search a nucleic acid database (e.g., by a BLAST or related search of the fragments against all known sequences). Alternatively, some sequence reconstruction using the families of probes that change by similar amounts can also be done. The database search for known genes that include sequences complementary (or nearly complementary) to the probes that change the most is not difficult and, because it is generally easier than sequence reconstruction, is the preferred method for identifying the differentially expressed sequences.

In another embodiment, the differential hybridization pattern indicates that there are significant differences in the overall expression profile(s) between the tested samples, and identifies probes that are specific for the differences. These probes can be used as specific affinity reagents to extract from the samples the parts that differ. This can be accomplished in several ways:

In one approach, the material hybridized to the probes that show the greatest differences between samples can be micro-extracted from the high density array. For example, the hybridized nucleic acids can be removed using small capillaries. Alternatively, probes that are anchored to the chip with a photolabile linker can be released by selective irradiation at the desired parts of the array.

In another approach, because the sequence of all the probes on the high-density array is known, and the probes that hybridize differentially have been identified, the latter can be used as affinity reagents to extract the nucleic acids that differentially hybridize in the test samples. Once the differentially hybridizing probes are identified in the array, the probe (or probes) can be synthesized on beads (or other solid support) and hybridized to the samples (not necessarily fragmented for this step; full length clones may be desirable). The material that is extracted can be cloned and/or sequenced, according to standard methods known to those of skill in the art, to obtain the desired information about the differentially expressed species (e.g., clones can be screened with labeled oligonucleotides to determine ones with appropriate inserts, and/or randomly chosen and sequenced).

In still another approach, the sequence of the hybridized probes of interest can be used to generate amplification primers (e.g., reverse transcription and/or PCR primers). The differentially expressed sequence can then be amplified and used as a probe to probe a genomic or cDNA library using sequence specific primers determined from the array in combination with specific sequences added in a reverse transcriptase cDNA step as described above (e.g., primer based on poly A and added 3' sequence). Examples of appropriate cloning and sequencing techniques, and instructions sufficient to direct persons of skill through many cloning exercises are found in Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology, volume 152, Academic Press, Inc., San Diego, CA; Sambrook et al. (1989) Molecular Cloning - A Laboratory Manual (2nd ed.) Vol. 1-3; and Current Protocols in Molecular Biology, F.M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1994 Supplement) (Ausubel). Product information from manufacturers of biological reagents and experimental equipment also provides information useful in known biological methods. Such manufacturers include the SIGMA chemical company (Saint Louis, MO), R&D systems (Minneapolis, MN), Pharmacia LKB Biotechnology (Piscataway, NJ), CLONTECH Laboratories, Inc. (Palo Alto, CA), Chem Genes Corp., Aldrich Chemical Company (Milwaukee, WI), Glen Research, Inc., GIBCO BRL Life Technologies, Inc. (Gaithersburg, MD), Fluka Chemica-Biochemika Analytika (Fluka Chemie AG, Buchs, Switzerland), Invitrogen, San Diego, CA, and Applied Biosystems (Foster City, CA), as well as many other commercial sources known to one of skill.

In short, using the above-described method, differentially expressed genes can be identified without prior assumptions about which genes to monitor and without prior knowledge of sequence. Once identified (and sequenced if not a previously sequenced gene), the new sequences can be included in a high density array designed to detect and quantify specific genes in the same way as described in copending applications No. 08/529,115 filed on September 15, 1995 and …. Thus, the two approaches are complementary in that one can be used to broadly search for expression differences of perhaps unknown genes, while the other is used to more specifically monitor those genes that have been chosen as important or those genes that have been previously at least partially sequenced.

PCT/US97/01603



XI. Kits for Expression Monitoring and Generic Difference Screening.

In another embodiment, this invention provides kits for expression monitoring and/or generic difference screening. The kits include, but are not limited to, a container or containers containing one or more high density oligonucleotide arrays of this invention. Preferred kits for generic difference screening include at least two high density arrays. The kits can also include a label or labels for labeling one or more nucleic acid samples. In addition, the kits can include one or more ligatable oligonucleotides. In certain embodiments, the kit contains pools of different ligatable oligonucleotides, preferably pools of every possible oligonucleotide of a particular length (e.g., all possible 6 mers) or sets of specific ligatable oligonucleotides. One of skill in the art will appreciate that the kits may include any other of the various labels, devices (e.g., trays, microscope filters, syringes, etc.), buffers, and the like useful for performing the hybridizations and ligation reactions described herein. In addition, the kits may include software provided on a storage medium (e.g., optical or magnetic disk) for the selection of probes and/or the analysis of hybridization data as described herein. In addition, the kits may contain instructional materials teaching the use of the kit in the various methods of this invention (e.g., in practice of various expression monitoring methods or generic difference screening methods described herein).

XII. Computer-Implemented Expression Monitoring.

The methods of monitoring gene expression of this invention may be performed utilizing a computer. The computer typically runs a software program that includes computer code incorporating the invention for analyzing hybridization intensities measured from a substrate or chip and thus, monitoring the expression of one or more genes or screening for differences in nucleic acid abundances. Although the following will describe specific embodiments of the invention, the invention is not limited to any one embodiment, so the following is for purposes of illustration and not limitation.



Fig. 6 illustrates an example of a computer system used to execute the software of an embodiment of the present invention. As shown, computer system 100 includes a monitor 102, screen 104, cabinet 106, keyboard 108, and mouse 110. Mouse 110 may have one or more buttons such as mouse buttons 112. Cabinet 106 houses a CD-ROM drive 114, a system memory and a hard drive (both shown in Fig. 7) which may be utilized to store and retrieve software programs incorporating computer code that implements the invention, data for use with the invention, and the like. Although a CD-ROM 116 is shown as an exemplary computer readable storage medium, other computer readable storage media including floppy disks, tape, flash memory, system memory, and hard drives may be utilized. Cabinet 106 also houses familiar computer components (not shown) such as a central processor, system memory, hard disk, and the like.

Fig. 7 shows a system block diagram of computer system 100 used to execute the software of an embodiment of the present invention. As in Fig. 6, computer system 100 includes monitor 102 and keyboard 108. Computer system 100 further includes subsystems such as a central processor 120, system memory 122, I/O controller 124, display adapter 126, removable disk 128 (e.g., CD-ROM drive), fixed disk 130 (e.g., hard drive), network interface 132, and speaker 134. Other computer systems suitable for use with the present invention may include additional or fewer subsystems. For example, another computer system could include more than one processor 120 (i.e., a multi-processor system) or a cache memory.

Arrows such as 136 represent the system bus architecture of computer system 100. However, these arrows are illustrative of any interconnection scheme serving to link the subsystems. For example, a local bus could be utilized to connect the central processor to the system memory and display adapter. Computer system 100 shown in Fig. 7 is but an example of a computer system suitable for use with the present invention. Other configurations of subsystems suitable for use with the present invention will be readily apparent to one of ordinary skill in the art.

Fig. 8 shows a flowchart of a process of monitoring the expression of a gene. The process compares hybridization intensities of pairs of perfect match and mismatch probes that are preferably covalently attached to the surface of a substrate or chip. Most preferably, the nucleic acid probes have a density greater than about 60






that 



twoi Id 



pre be. 



5 derived nake 



92 

different nucleic acid probes per 1 cm' of the substrate 
sequence of steps for clarity, this is not an indication 
this specific Older. Oxieofordinaiy skill in the ait 
the steps may be reordered, combined, and deleted 
5 hiitially, nucleic acid probes are selectei I 

target sequence (or gene). These probes are the perfect 
probes is specified that are intended to be not perfectly 
sequence. These probes are dse mismatch probes and 
least one nucleotide mismafffh finom a perfect match 
10 and the perfect match probe fiom \^ch it was 

mentioned earlier, the nucleotide mismatch is prefciaU^ 
probe. 

The probe lengths of the perfect match 
exhibit lugh hybridization affinity with the 

IS probes may be all 2(K-mers. However, probes of varying 
on the substrate for any number of reasons including 

The taiget sequence is typically 
substrate including the nucleic acid probes as 
intensities of the nucleic acid probes is then measured 

20 The computer system may be the same system that 
may be a difierent system altogether. Ofcouise,any 
invention shouM have available other details of the 
name, gene sequence, probe sequences, probe locaticms 
Referring to Fig. 8, after hybridization, 

25 of hybridization intensities of the multiple pairs oi 

step 202. The hybridization intensities indicate hybrii 
acid jRobes and the target nucleic acid (which 
perfect match probe that is perfectly complementary to 
and a mismatch probe that differs fiom the perfect 

30 At step 204, the computer system 

the perfect match and mi?gnatch probes of each pair. If 



Although the flowcharts show a 
the steps must be performed in 
readily recognize that many of 
fiom the invention, 
that are complementary to the 
match probes. Another set of 
complementary to the target 
mismatch probe includes at 
Accordiiigly, a mismatch probe 
up a pair of probes. As 
near the center of the mismatch 



will out departing f 



eich 



; target seque nee. 



f fiagmec ted. 



I described eariier. 



snd i 



tdire:ts 



esqinmenti 



ttie( 



of perf 



Idi zatio 



icorrespo ids 



tmauh 



lipobes are typically chosen to 

For example, the nucleic acid 
lengths may also be synthesized 
>lving ambiguities. 

labeled and exposed to a 

Hie hybridization 
input into a computer system, 
the substrate hybridization or it 
c(|mputer system for use with the 

including possibly the gene 
on the substrate, and the like, 
con^suter system receives input 
match and mismatch probes at 
on affinity between the nucleic 
to a gene). Each pair includes a 
a portion of the target nucleic acid 
probe by at least one imcleotide. 

intensities of 
the gene is expressed, the 



compi ires the hybridization i 
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substi ntially 



bri dizaiit 



t h]rbridiz ition 



sgae 



hybridization intensity (or affinity) of a perfect match probe of a pair should be recognizably higher than the corresponding mismatch probe. Generally, if the hybridization intensities of a pair of probes are substantially the same, it may indicate the gene is not expressed. However, the determination is not based on a single pair of probes; the determination of whether a gene is expressed is based on an analysis of many pairs of probes. An exemplary process of comparing the hybridization intensities of the pairs of probes will be described in more detail in reference to Fig. 9.

After the system compares the hybridization intensity of the perfect match and mismatch probes, the system indicates expression of the gene at step 206. As an example, the system may indicate to a user that the gene is either present (expressed), marginal or absent (unexpressed).

Fig. 9 shows a flowchart of a process of determining if a gene is expressed utilizing a decision matrix. At step 252, the computer system receives raw scan data of N pairs of perfect match and mismatch probes. In a preferred embodiment, the hybridization intensities are photon counts from a fluorescein labeled target that has hybridized to the probes on the substrate. For simplicity, the hybridization intensity of a perfect match probe will be designated "Ipm" and the hybridization intensity of a mismatch probe will be designated "Imm."

Hybridization intensities for a pair of probes are retrieved at step 254. The background signal intensity is subtracted from each of the hybridization intensities of the pair at step 256. Background subtraction may also be performed on all the raw scan data at the same time.

At step 258, the hybridization intensities of the pair of probes are compared to a difference threshold (D) and a ratio threshold (R). It is determined if the difference between the hybridization intensities of the pair (Ipm - Imm) is greater than or equal to the difference threshold AND the quotient of the hybridization intensities of the pair (Ipm / Imm) is greater than or equal to the ratio threshold. The difference thresholds are typically user defined values that have been determined to produce accurate expression monitoring of a gene or genes. In one embodiment, the difference threshold is 20 and the ratio threshold is 1.2.

If Ipm - Imm >= D and Ipm / Imm >= R, the value NPOS is incremented at step 260. In general, NPOS is a value that indicates the number of pairs of probes which have hybridization intensities indicating that the gene is likely expressed. NPOS is utilized in a determination of the expression of the gene.

At step 262, it is determined if Imm - Ipm >= D and Imm / Ipm >= R. If this expression is true, the value NNEG is incremented at step 264. In general, NNEG is a value that indicates the number of pairs of probes which have hybridization intensities indicating that the gene is likely not expressed. NNEG, like NPOS, is utilized in a determination of the expression of the gene.

For each pair that exhibits hybridization intensities either indicating the gene is expressed or not expressed, a log ratio value (LR) and intensity difference value (IDIF) are calculated at step 266. LR is calculated by the log of the quotient of the hybridization intensities of the pair (Ipm / Imm). The IDIF is calculated by the difference between the hybridization intensities of the pair (Ipm - Imm). If there is a next pair of hybridization intensities at step 268, they are retrieved at step 254.

At step 272, a decision matrix is utilized to indicate if the gene is expressed. The decision matrix utilizes the values N, NPOS, NNEG, and LR (multiple LRs). The following four assignments are performed:

    P1 = NPOS / NNEG
    P2 = NPOS / N
    P3 = (10 * SUM(LR)) / (NPOS + NNEG)

These P values are then utilized to determine if the gene is expressed.

For purposes of illustration, the P values are broken down into ranges. If P1 is greater than or equal to 2.1, then A is true. If P1 is less than 2.1 and greater than or equal to 1.8, then B is true. Otherwise, C is true. Thus, P1 is broken down into three ranges A, B and C. This is done to aid the reader's understanding of the invention.

Thus, all of the P values are broken down into ranges according to the following:

    A = (P1 >= 2.1)
    B = (2.1 > P1 >= 1.8)
    C = (P1 < 1.8)

    X = (P2 >= 0.35)
    Y = (0.35 > P2 >= 0.20)
    Z = (P2 < 0.20)

    Q = (P3 >= 1.5)
    R = (1.5 > P3 >= 1.1)
    S = (P3 < 1.1)

Once the P values are broken down into ranges according to the above boolean values, the gene expression is determined.

The gene expression is indicated as present (expressed), marginal or absent (not expressed). The gene is indicated as expressed if the following expression is true: A and (X or Y) and (Q or R). In other words, the gene is indicated as expressed if P1 >= 2.1, P2 >= 0.20 and P3 >= 1.1. Additionally, the gene is indicated as expressed if the following expression is true: B and X and Q.

With the foregoing explanation, the following is a summary of the gene expression indications:

    Present     A and (X or Y) and (Q or R)
                B and X and Q

    Marginal    A and X and S
                B and X and R
                B and Y and (Q or R)

    Absent      All others cases (e.g., any C combination)

In the output to the user, present may be indicated as "P," marginal as "M" and absent as "A" at step 274.

Once all the pairs of probes have been processed and the expression of the gene indicated, an average of ten times the LRs is computed at step 275. Additionally, an average of the IDIF values for the probes that incremented NPOS and NNEG is calculated. These values may be utilized for quantitative comparisons of this experiment with other experiments.
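The Fig. 9 procedure (steps 252 through 274) can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used with the invention. The function name `absolute_call`, the `floor` applied after background subtraction, the base-10 logarithm, and the handling of NNEG = 0 are all assumptions introduced here, because the text does not say how zero or negative intensities and an empty denominator are treated. The thresholds D = 20 and R = 1.2 and the range limits are the ones given for the described embodiment.

```python
import math


def absolute_call(pairs, background=0.0, d=20.0, r=1.2, floor=1.0):
    """Return "P", "M" or "A" for one gene from N raw (PM, MM) intensity pairs."""
    n = len(pairs)
    if n == 0:
        raise ValueError("at least one probe pair is required")
    npos = nneg = 0
    sum_lr = 0.0
    for pm_raw, mm_raw in pairs:
        # Step 256: background subtraction. The floor is an assumption that
        # keeps the quotient and the logarithm defined.
        ipm = max(pm_raw - background, floor)
        imm = max(mm_raw - background, floor)
        # Steps 258 and 262: difference AND ratio tests in both directions.
        positive = ipm - imm >= d and ipm / imm >= r
        negative = imm - ipm >= d and imm / ipm >= r
        npos += positive
        nneg += negative
        if positive or negative:
            # Step 266: log ratio (base 10 is an assumption).
            sum_lr += math.log10(ipm / imm)
    scored = npos + nneg
    if nneg:
        p1 = npos / nneg
    else:
        # Assumption: with no negative pairs, P1 is unbounded if any pair is positive.
        p1 = math.inf if npos else 0.0
    p2 = npos / n
    p3 = 10 * sum_lr / scored if scored else 0.0
    # Step 272: break the P values into ranges.
    a, b = p1 >= 2.1, 1.8 <= p1 < 2.1
    x, y = p2 >= 0.35, 0.20 <= p2 < 0.35
    q, rr, s = p3 >= 1.5, 1.1 <= p3 < 1.5, p3 < 1.1
    # Step 274: decision matrix.
    if (a and (x or y) and (q or rr)) or (b and x and q):
        return "P"
    if (a and x and s) or (b and x and rr) or (b and y and (q or rr)):
        return "M"
    return "A"
```

For example, ten pairs that all read (200, 50) pass both tests, so NPOS = 10 and NNEG = 0 and the call is "P", while ten pairs that all read (50, 50) increment neither counter and the call is "A".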
Quantitative measurements may be performed at step 276. For example, the current experiment may be compared to a previous experiment (e.g., utilizing values calculated at step 270). Additionally, the experiment may be compared to hybridization intensities of RNA (such as from bacteria) present in the biological sample in a known quantity. In this manner, one may verify the correctness of the gene expression indication or call, modify threshold values, or perform any number of modifications of the preceding.

For simplicity, Fig. 9 was described in reference to a single gene. However, the process may be utilized on multiple genes in a biological sample. Therefore, any discussion of the analysis of a single gene is not an indication that the process may not be extended to processing multiple genes.

Figs. 10A and 10B show the flow of a process of determining the expression of a gene by comparing baseline scan data and experimental scan data. For example, the baseline scan data may be from a biological sample where it is known the gene is expressed. Thus, this scan data may be compared to a different biological sample to determine if the gene is expressed. Additionally, it may be determined how the expression of a gene or genes changes over time in a biological organism.

At step 302, the computer system receives raw scan data of N pairs of perfect match and mismatch probes from the baseline. The hybridization intensity of a perfect match probe from the baseline will be designated "Ipm" and the hybridization intensity of a mismatch probe from the baseline will be designated "Imm." The background signal intensity is subtracted from each of the hybridization intensities of the pairs of baseline scan data at step 304.

At step 306, the computer system receives raw scan data of N pairs of perfect match and mismatch probes from the experimental biological sample. The hybridization intensity of a perfect match probe from the experiment will be designated "Jpm" and the hybridization intensity of a mismatch probe from the experiment will be designated "Jmm." The background signal intensity is subtracted from each of the hybridization intensities of the pairs of experimental scan data at step 308.

The hybridization intensities of an I and J pair may be normalized at step 310. For example, the hybridization intensities of the I and J pairs may be divided by the hybridization intensity of control probes as discussed above in Section IV(A).

At step 312, the hybridization intensities of the I and J pair of probes are compared to a difference threshold (DDIF) and a ratio threshold (RDIF). It is determined if the difference between the hybridization intensities of the one pair (Jpm - Jmm) and the other pair (Ipm - Imm) is greater than or equal to the difference threshold AND the quotient of the hybridization intensities of one pair (Jpm - Jmm) and the other pair (Ipm - Imm) is greater than or equal to the ratio threshold. The difference thresholds are typically user defined values that have been determined to produce accurate expression monitoring of a gene or genes.

If (Jpm - Jmm) - (Ipm - Imm) >= DDIF and (Jpm - Jmm) / (Ipm - Imm) >= RDIF, the value NINC is incremented at step 314. In general, NINC is a value that indicates the experimental pair of probes indicates that the gene expression is likely greater (or increased) than the baseline sample. NINC is utilized in a determination of whether the expression of the gene is greater (or increased), less (or decreased) or did not change in the experimental sample compared to the baseline sample.

At step 316, it is determined if (Ipm - Imm) - (Jpm - Jmm) >= DDIF and (Ipm - Imm) / (Jpm - Jmm) >= RDIF. If this expression is true, NDEC is incremented. In general, NDEC is a value that indicates the experimental pair of probes indicates that the gene expression is likely less (or decreased) than the baseline sample. NDEC is utilized in a determination of whether the expression of the gene is greater (or increased), less (or decreased) or did not change in the experimental sample compared to the baseline sample.

For each of the pairs that exhibits hybridization intensities either indicating the gene is expressed more or less in the experimental sample, the values NPOS, NNEG and LR are calculated for each pair of probes. These values are calculated as discussed above in reference to Fig. 9. A suffix of either "B" or "E" has been added to each value in order to indicate if the value denotes the baseline sample or the experimental sample, respectively. If there are next pairs of hybridization intensities at step 322, they are processed in a similar manner as shown.

Referring now to Fig. 10B, an absolute decision computation is performed for both the baseline and experimental samples at step 324. The absolute decision computation is an indication of whether the gene is expressed, marginal or absent in each of the baseline and experimental samples. Accordingly, in a preferred embodiment, this step entails performing steps 272 and 274 from Fig. 9 for each of the samples. This being done, there is an indication of gene expression for each of the samples taken alone.

At step 326, a decision matrix is utilized to determine the difference in gene expression between the two samples. This decision matrix utilizes the values N, NPOSB, NPOSE, NNEGB, NNEGE, NINC, NDEC, LRB, and LRE as they were calculated above. The decision matrix performs different calculations depending on whether NINC is greater than or equal to NDEC. The calculations are as follows.

If NINC >= NDEC, the following four P values are determined:

    P1 = NINC / NDEC
    P2 = NINC / N
    P3 = ((NPOSE - NPOSB) - (NNEGE - NNEGB)) / N
    P4 = 10 * SUM(LRE - LRB) / N

These P values are then utilized to determine the difference in gene expression between the two samples.

For purposes of illustration, the P values are broken down into ranges as was done previously. Thus, all of the P values are broken down into ranges according to the following:

    A = (P1 >= 2.7)
    B = (2.7 > P1 >= 1.8)
    C = (P1 < 1.8)

    X = (P2 >= 0.24)
    Y = (0.24 > P2 >= 0.16)
    Z = (P2 < 0.16)

    M = (P3 >= 0.17)
    N = (0.17 > P3 >= 0.10)
    O = (P3 < 0.10)

    Q = (P4 >= 1.3)
    R = (1.3 > P4 >= 0.9)
    S = (P4 < 0.9)

Once the P values are broken down into ranges according to the above boolean values, the difference in gene expression between the two samples is determined.

In this case where NINC >= NDEC, the gene expression change is indicated as increased, marginal increase or no change. The following is a summary of the gene expression indications:

    Increased    A and (X or Y) and (Q or R) and (M or N or O)
                 A and (X or Y) and (Q or R or S) and (M or N)
                 B and (X or Y) and (Q or R) and (M or N)
                 A and X and (Q or R or S) and (M or N or O)

    Marginal     A and Y and S and O
    Increase     B and (X or Y) and (Q or R) and O
                 B and (X or Y) and S and (M or N)
                 C and (X or Y) and (Q or R) and (M or N)

    No Change    All others cases (e.g., any Z combination)

In the output to the user, increased may be indicated as "I," marginal increase as "MI" and no change as "NC."

If NINC < NDEC, the following four P values are determined:

    P1 = NDEC / NINC
    P2 = NDEC / N
    P3 = ((NNEGE - NNEGB) - (NPOSE - NPOSB)) / N
    P4 = 10 * SUM(LRE - LRB) / N

These P values are then utilized to determine the difference in gene expression between the two samples.

The P values are broken down into the same ranges as for the other case where NINC >= NDEC. Thus, P values in this case have the same ranges and will not be repeated for the sake of brevity. However, the ranges generally indicate different changes in the gene expression between the two samples as shown below.

In this case where NINC < NDEC, the gene expression change is indicated as decreased, marginal decrease or no change. The following is a summary of the gene expression indications:

    Decreased    A and (X or Y) and (Q or R) and (M or N or O)
                 A and (X or Y) and (Q or R or S) and (M or N)
                 B and (X or Y) and (Q or R) and (M or N)
                 A and X and (Q or R or S) and (M or N or O)

    Marginal     A and Y and S and O
    Decrease     B and (X or Y) and (Q or R) and O
                 B and (X or Y) and S and (M or N)
                 C and (X or Y) and (Q or R) and (M or N)

    No Change    All others cases (e.g., any Z combination)

In the output to the user, decreased may be indicated as "D," marginal decrease as "MD" and no change as "NC."

The above has shown that the relative difference between the gene expression between a baseline sample and an experimental sample may be determined. An additional test may be performed that would change an I, MI, D, or MD (i.e., not NC) call to NC if the gene is indicated as expressed in both samples (e.g., from step 324) and the following expressions are all true:

    Average(IDIFB) >= 200
    Average(IDIFE) >= 200
    1.4 >= Average(IDIFE) / Average(IDIFB) >= 0.7

Thus, when a gene is expressed in both samples, a call of increased or decreased (whether marginal or not) will be changed to a no change call if the average intensity difference for each sample is relatively large or substantially the same for both samples. The IDIFB and IDIFE are calculated as the sum of all the IDIFs for each sample divided by N.

At step 328, values for quantitative difference evaluation are calculated. An average of ((Jpm - Jmm) - (Ipm - Imm)) for each of the pairs is calculated. Additionally, a quotient of the average of Jpm - Jmm and the average of Ipm - Imm is calculated. These values may be utilized to compare the results with other experiments in step 330.
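The comparison of Figs. 10A and 10B can be sketched in Python along the same lines. This is a hedged illustration only. The names `pair_stats`, `relative_call` and `apply_no_change_override`, the `floor` used to keep quotients defined, the base-10 logarithm, and the choice to accumulate NPOS, NNEG and LR over every pair are assumptions made here. DDIF and RDIF are left as required arguments because the text describes them as user defined and gives no values. The first marginal row is read as the conjunction "A and Y and S and O", and the decrease test at step 316 is taken as the increase test with the baseline and experimental pairs exchanged. Both readings are assumptions chosen so that the three outcomes do not overlap. P4 is computed as LRE - LRB in both branches, as printed.

```python
import math


def pair_stats(pm, mm, d=20.0, r=1.2, floor=1.0):
    """Fig. 9 tests for one background-subtracted pair: (positive, negative, LR)."""
    pm, mm = max(pm, floor), max(mm, floor)
    positive = pm - mm >= d and pm / mm >= r
    negative = mm - pm >= d and mm / pm >= r
    return positive, negative, math.log10(pm / mm)


def relative_call(baseline, experiment, ddif, rdif, floor=1.0):
    """Return "I", "MI", "D", "MD" or "NC" for matched lists of (PM, MM) pairs."""
    n = len(baseline)
    if n == 0 or n != len(experiment):
        raise ValueError("baseline and experiment need the same non-zero length")
    ninc = ndec = dpos = dneg = 0
    sum_dlr = 0.0
    for (ipm, imm), (jpm, jmm) in zip(baseline, experiment):
        di, dj = ipm - imm, jpm - jmm
        fi, fj = max(di, floor), max(dj, floor)  # floored copies for the quotients
        if dj - di >= ddif and fj / fi >= rdif:      # step 314
            ninc += 1
        elif di - dj >= ddif and fi / fj >= rdif:    # step 316 (roles exchanged)
            ndec += 1
        pos_b, neg_b, lr_b = pair_stats(ipm, imm, floor=floor)
        pos_e, neg_e, lr_e = pair_stats(jpm, jmm, floor=floor)
        dpos += pos_e - pos_b      # contributes to NPOSE - NPOSB
        dneg += neg_e - neg_b      # contributes to NNEGE - NNEGB
        sum_dlr += lr_e - lr_b
    increased = ninc >= ndec
    major, minor = (ninc, ndec) if increased else (ndec, ninc)
    # Assumption: P1 is unbounded when the denominator is zero and the numerator is not.
    p1 = major / minor if minor else (math.inf if major else 0.0)
    p2 = major / n
    p3 = (dpos - dneg) / n if increased else (dneg - dpos) / n
    p4 = 10 * sum_dlr / n
    a, b, c = p1 >= 2.7, 1.8 <= p1 < 2.7, p1 < 1.8
    x, y = p2 >= 0.24, 0.16 <= p2 < 0.24
    m, nn, o = p3 >= 0.17, 0.10 <= p3 < 0.17, p3 < 0.10
    q, r, s = p4 >= 1.3, 0.9 <= p4 < 1.3, p4 < 0.9
    strong = ((a and (x or y) and (q or r))
              or (a and (x or y) and (m or nn))
              or (b and (x or y) and (q or r) and (m or nn))
              or (a and x))
    marginal = ((a and y and s and o)
                or (b and (x or y) and (q or r) and o)
                or (b and (x or y) and s and (m or nn))
                or (c and (x or y) and (q or r) and (m or nn)))
    if strong:
        return "I" if increased else "D"
    if marginal:
        return "MI" if increased else "MD"
    return "NC"


def apply_no_change_override(call, expressed_in_both, avg_idif_b, avg_idif_e):
    """Additional test: revert a change call to "NC" for two large, similar signals."""
    if (call != "NC" and expressed_in_both
            and avg_idif_b >= 200 and avg_idif_e >= 200
            and 0.7 <= avg_idif_e / avg_idif_b <= 1.4):
        return "NC"
    return call
```

In use, `relative_call` would be run on the background-subtracted (and optionally normalized) pairs of the two scans, and `apply_no_change_override` would then be applied with the absolute calls from step 324 and the average IDIF values of each sample.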



EXAMPLES

The following examples are offered to illustrate, but not to limit the present invention.

Example 1

First Generation Oligonucleotide Arrays Designed to Measure mRNA Levels for a Small Number of Murine Cytokines.

A) Preparation of Labeled RNA.

1) From Each of the Preselected Genes

Fourteen genes (IL-2, IL-3, IL-4, IL-6, IL-10, IL-12p40, GM-CSF, IFN-γ, TNF-α, CTLA8, β-actin, GAPDH, IL-11 receptor, and Bio B) were each cloned into the p Bluescript II KS (+) phagemid (Stratagene, La Jolla, California, USA). The orientation of the insert was such that T3 RNA polymerase gave sense transcripts and T7 polymerase gave antisense RNA.

Labeled ribonucleotides in an in vitro transcription (IVT) reaction. Either biotin- or fluorescein-labeled UTP and CTP (1:3 labeled to unlabeled) plus unlabeled ATP and GTP were used for the reaction with 2500 units of T7 RNA polymerase (Epicentre Technologies, Madison, Wisconsin, USA). In vitro transcription was done with cut templates in a manner like that described by Melton et al., Nucleic Acids Research, 12: 7035-7056 (1984). A typical in vitro transcription reaction used 5 µg DNA template, a buffer such as that included in Ambion's Maxiscript in vitro Transcription Kit (Ambion Inc., Austin, Texas, USA) and GTP (3 mM), ATP (1.5 mM), and CTP and fluoresceinated UTP (3 mM total, UTP: Fl-UTP 3:1) or UTP and fluoresceinated CTP (2 mM total, CTP: Fl-CTP, 3:1). Reactions done in the Ambion buffer had 20 mM DTT and RNase inhibitor. The reaction was run from 1.5 to about 8 hours.

Following the reaction, unincorporated nucleotide triphosphates were removed using a size-selective membrane (microcon-100) or Pharmacia microspin S-200 column. The total molar concentration of RNA was based on a measurement of the absorbance at 260 nm. Following quantitation of RNA amounts, RNA was fragmented randomly to an average length of approximately 50 - 100 bases by heating at 94 °C in 40 mM Tris-acetate pH 8.1, 100 mM potassium acetate, 30 mM magnesium acetate for 30 - 40 minutes. Fragmentation reduces possible interference from RNA secondary structure, and minimizes the effects of multiple interactions with closely spaced probe molecules.

2) From cDNA Libraries.

Labeled RNA was produced from one of two murine cell lines; T10, a B cell plasmacytoma which was known not to express the genes (except IL-10, actin and GAPDH) used as target genes in this study, and 2D6, an IL-12 growth dependent T cell line (Th1 subtype) that is known to express most of the genes used as target genes in this study. Thus, RNA derived from the T10 cell line provided a good total RNA baseline mixture suitable for spiking with known quantities of RNA from the particular target genes. In contrast, mRNA derived from the 2D6 cell line provided a good positive control providing typical endogenously transcribed amounts of the RNA from the target genes.

i) The T10 murine B cell line.

The T10 cell line (B cells) was derived from the IL-6 dependent murine plasmacytoma line T1165 (Nordan et al. (1986) Science 233: 566-569) by selection in the presence of IL-11. To prepare the directional cDNA library, total cellular RNA was isolated from T10 cells using RNAStat60 (Tel-Test B), and poly (A)+ RNA was selected using the PolyAtract kit (Promega, Madison, Wisconsin, USA). First and second strand cDNA was synthesized according to Toole et al., (1984) Nature, 312: 342-347, except that 5-methyldeoxycytidine 5' triphosphate (Pharmacia LKB, Piscataway, New Jersey, USA) was substituted for dCTP in both reactions.

To determine cDNA frequencies T10 libraries were plated, and DNA was transferred to nitrocellulose filters and probed with labeled β-actin, GAPDH and IL-10 probes. Actin was represented at a frequency of 1:3000, GAPDH at 1:1000, and IL-10 at 1:35,000. Labeled sense and antisense T10 RNA samples were synthesized from NotI and SfiI cut cDNA libraries in in vitro transcription reactions as described above.

ii) The 2D6 murine helper T cell line.

The 2D6 cell line is a murine IL-12 dependent T cell line developed by Fujiwara et al. Cells were cultured in RPMI 1640 medium with 10% heat inactivated fetal calf serum (JRH Biosciences), 0.05 mM β-mercaptoethanol and recombinant murine IL-12 (100 units/mL, Genetics Institute, Cambridge, Massachusetts, USA). For cytokine induction, cells were preincubated overnight in IL-12 free medium and then resuspended (10^ cells/ml). After incubation for 0, 2, 6 and 24 hours in media containing 5 nM calcium ionophore A23187 (Sigma Chemical Co., St. Louis, Missouri, USA) and 100 nM 4-phorbol-12-myristate 13-acetate (Sigma), cells were collected by centrifugation and washed once with phosphate buffered saline prior to isolation of RNA.

Labeled 2D6 mRNA was produced by directionally cloning the 2D6 cDNA with λZipLox, NotI-SalI arms available from GibcoBRL in a manner similar to T10. The linearized pZL1 library was transcribed with T7 to generate sense RNA as described above.

RNA preparation.

For material made directly from cellular RNA, cytoplasmic RNA was extracted from cells by the method of Favaloro et al. (1980) Meth. Enzym., 65: 718-749, and poly (A)+ RNA was isolated with an oligo dT selection step (PolyAtract, Promega). RNA was amplified using a modification of the procedure described by Eberwine et al. (1992) Proc. Natl. Acad. Sci. USA, 89: 3010-3014 (see also Van Gelder et al. (1990) Science 87: 1663-1667). One microgram of poly (A)+ RNA was converted into double-stranded cDNA using a cDNA synthesis kit (Life Technologies) with an oligo dT primer incorporating a T7 RNA polymerase promoter site. After second strand synthesis, the reaction mixture was extracted with phenol/chloroform and the double-stranded DNA isolated using a membrane filtration step (Microcon-100, Amicon, Inc. Beverly, Massachusetts, USA). Labeled cRNA was made directly from the cDNA pool with an IVT step as described above. The total molar concentration of labeled cRNA was determined from the absorbance at 260 and assuming an average RNA size of 1000 ribonucleotides. RNA concentration was calculated using the conventional conversion that 1 OD is equivalent to 40 µg of RNA, and that 1 µg of cellular mRNA consists of 3 pmoles of RNA molecules.

Cellular mRNA was also labeled directly without any intermediate cDNA or RNA synthesis steps. Poly (A)+ RNA was fragmented as described above, and the 5' ends of the fragments were kinased and then incubated overnight with a biotinylated oligoribonucleotide (5'-biotin-AAAAAA-3') in the presence of T4 RNA ligase (Epicentre Technologies). Alternatively, mRNA was labeled directly by UV-induced crosslinking to a psoralen derivative linked to biotin (Schleicher & Schuell).

A high density array of 20 mer oligonucleotide probes was produced using VLSIPS technology. The high density array included the oligonucleotide probes as listed in Table 2. A central mismatch control probe was provided for each gene-specific probe resulting in a high density array containing over 16,000 different oligonucleotide probes.
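The concentration convention stated earlier for labeled cRNA (1 OD at 260 nm taken as 40 µg of RNA, and 1 µg of RNA of average size 1000 ribonucleotides taken as 3 pmoles) can be illustrated with a short Python sketch. The function name `rna_concentration_nM`, the `dilution` parameter, and the reading of "40 µg" as 40 µg per mL in a 1 cm path are assumptions added here for illustration.

```python
def rna_concentration_nM(a260, dilution=1.0, ug_per_od=40.0, pmol_per_ug=3.0):
    """Estimate the molar RNA concentration (nM) from an absorbance reading at 260 nm.

    Assumes 1 OD corresponds to 40 ug/mL of RNA and that 1 ug of RNA with an
    average size of 1000 ribonucleotides is 3 pmol. Since 1 pmol/mL equals
    1 nM, the result is reported in nM.
    """
    ug_per_ml = a260 * dilution * ug_per_od
    return ug_per_ml * pmol_per_ug
```

Under these assumptions an undiluted reading of 0.5 OD corresponds to 20 µg/mL, or 60 nM of RNA molecules.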
Table 2. High density array design. For every probe there was also a mismatch control having a central 1 base mismatch.

    Probe Type                          Target Nucleic Acid    Number of Probes

    Test Probes:
    House Keeping Genes:
    Bacterial gene (sample
    preparation/amplification
    control)

    Number of Probes column, in the order printed:
    691, 751, 361, 691, 481, 911, 661, 991, 641, 391, 158, 388, 669, 286

The high density array was synthesized on a planar glass slide.

C) Array Hybridization and Scanning.

The RNA transcribed from cDNA was hybridized to the high density oligonucleotide probe array(s) at low stringency and then washed under more stringent conditions. The hybridization solutions contained 0.9 M NaCl, 60 mM NaH2PO4, 6 mM EDTA and 0.005 % Triton X-100, adjusted to pH 7.6 (referred to as 6x SSPE-T). In addition, the solutions contained 0.5 mg/ml unlabeled, degraded herring sperm DNA (Sigma Chemical Co., St. Louis, Missouri, USA). Prior to hybridization, RNA samples were heated in the hybridization solution to 99 °C for 10 minutes, placed on ice for 5 minutes, and allowed to equilibrate at room temperature before being placed in the hybridization flow cell. Following hybridization, the solution was removed, the arrays were washed with 6x SSPE-T at 22 °C for 7 minutes, and then washed with 0.5x SSPE-T at 40 °C for 15 minutes. When biotin-labeled RNA was used the hybridized RNA was stained with a streptavidin-phycoerythrin conjugate (Molecular Probes, Inc., Eugene, Oregon, USA) prior to reading. Hybridized arrays were stained with 2 µg/ml streptavidin-phycoerythrin in 6x SSPE-T at 40 °C for 5 minutes.

The arrays were read using a scanning confocal microscope (Molecular Dynamics, Sunnyvale, California, USA) modified for the purpose. The scanner uses an argon ion laser as the excitation source, and the emission was detected with a photomultiplier tube through either a 530 nm bandpass filter (fluorescein) or a 560 nm longpass filter (phycoerythrin).

Nucleic acids of either sense or antisense orientations were used in hybridization experiments. Arrays with probes for either orientation (reverse complements of each other) were made using the same set of photolithographic masks by reversing the order of the photochemical steps and incorporating the complementary nucleotide.

D) Quantitative Analysis of Hybridization Patterns and Intensities.

The quantitative analysis of the hybridization results involved counting the instances in which the perfect match probe (PM) was brighter than the corresponding mismatch probe (MM), averaging the differences (PM minus MM) for each probe family (i.e., probe collection for each gene), and comparing the values to those obtained in a side-by-side experiment on an identically synthesized array with an unspiked sample (if applicable). The advantage of the difference method is that signals from random cross hybridization contribute equally, on average, to the PM and MM probes while specific hybridization contributes more to the PM probes. By averaging the pairwise differences, the real signals add constructively while the contributions from cross hybridization tend to cancel.

The magnitude of the changes in the average of the difference (PM-MM) values was interpreted by comparison with the results of spiking experiments as well as the signal observed for the internal standard bacterial RNA spiked into each sample at a known amount. Analysis was performed using algorithms and software described herein.
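The difference method described in this section (count the pairs in which PM is brighter than MM, then average PM minus MM over the probe family) can be sketched as follows. The function name `average_difference` and the representation of a probe family as a list of (PM, MM) tuples are assumptions made for illustration.

```python
def average_difference(pairs):
    """Summarize one probe family given as a list of (PM, MM) intensities.

    Returns (number of pairs with PM brighter than MM, mean of PM - MM).
    """
    if not pairs:
        raise ValueError("a probe family needs at least one probe pair")
    brighter = sum(1 for pm, mm in pairs if pm > mm)
    mean_diff = sum(pm - mm for pm, mm in pairs) / len(pairs)
    return brighter, mean_diff
```

A signal that adds equally to both probes of a pair, as random cross hybridization is expected to do on average, leaves PM minus MM unchanged, which is why it drops out of the averaged difference.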
In order to optimize probe selection for each of the target genes, the high density array of oligonucleotide probes was hybridized with the mixture of labeled RNAs transcribed from each of the target genes. Fluorescence intensity at each location on the high density array was determined by scanning the high density array with a laser illuminated scanning confocal fluorescence microscope connected to a data acquisition system.

Probes were then selected for further data analysis in a two-step procedure. First, in order to be counted, the difference in intensity between a probe and its corresponding mismatch probe had to exceed a threshold limit (50 counts, or about half background, in this case). This eliminated from consideration probes that did not hybridize well and probes for which the mismatch control hybridizes at an intensity comparable to the perfect match.

The high density array was hybridized to a labeled RNA sample which, in principle, contains none of the sequences on the high density array. In this case, the oligonucleotide probes were chosen to be complementary to the sense RNA. Thus, an anti-sense RNA population should have been incapable of hybridizing to any of the probes on the array. Where either a probe or its mismatch showed a signal above a threshold value (100 counts above background) it was not included in subsequent analysis.

Then, the signal for a particular gene was counted as the average difference (perfect match - mismatch control) for the selected probes for each gene.

E) Results: The High Density Arrays Provide Specific and Sensitive Detection of Target Nucleic Acids.

As explained above, the initial arrays contained more than 16,000 probes that were complementary to 12 murine mRNAs - 9 cytokines, 1 cytokine receptor, 2 constitutively expressed genes (β-actin and glyceraldehyde 3-phosphate dehydrogenase) - 1 rat cytokine and 1 bacterial gene (E. coli biotin synthetase, bioB) which serves as a quantitation reference. The initial experiments with these relatively simple arrays were designed to determine whether short in situ synthesized oligonucleotides can be made to hybridize with sufficient sensitivity and specificity to quantitatively detect RNAs in a complex cellular RNA population. These arrays were intentionally highly redundant, containing hundreds of oligonucleotide probes per RNA, many more than necessary for the determination of expression levels. This was done to investigate the hybridization behavior of a large number of probes and develop general sequence rules for a priori selection of minimal probe sets for arrays covering substantially larger numbers of genes.

The oligonucleotide arrays contained collections of pairs of probes for each of the RNAs being monitored. Each probe pair consisted of a 20-mer that was perfectly complementary (referred to as a perfect match, or PM probe) to a subsequence of a particular message, and a companion that was identical except for a single base difference in a central position. The mismatch (MM) probe of each pair served as an internal control for hybridization specificity. The analysis of PM/MM pairs allowed low intensity hybridization patterns from rare RNAs to be sensitively and accurately recognized in the presence of crosshybridization signals.

For array hybridization experiments, labeled RNA target samples were prepared from individual clones, cloned cDNA libraries, or directly from cellular mRNA as described above. Target RNA for array hybridization was prepared by incorporating fluorescently labeled ribonucleotides in an in vitro transcription (IVT) reaction and then randomly fragmenting the RNA to an average size of 30 - 100 bases. Samples were hybridized to arrays in a self-contained flow cell (volume ~200 µL) for times ranging from 30 minutes to 22 hours. Fluorescence imaging of the arrays was accomplished with a scanning confocal microscope (Molecular Dynamics). The entire array was read at a resolution of 11.25 µm (~ 80-fold oversampling in each of the 100 x 100 µm synthesis regions) in less than 15 minutes, yielding a rapid and quantitative measure of each of the individual hybridization reactions.

1) Specificity of Hybridization

In order to evaluate the specificity of hybridization, the high density array described above was hybridized with 50 pM of the sense strand of IL-2, IL-3, IL-4, IL-6, Actin, GAPDH and Bio B or IL-10, IL-12p40, GM-CSF, IFN-γ, TNF-α, mCTLA8 and Bio B. The hybridized array showed strong specific signals for each of the test target nucleic acids with minimal cross hybridization.

2) Detection of Gene Expression levels in a complex target sample.

To determine how well individual RNA targets could be detected in the presence of total mammalian cell message populations, spiking experiments were carried out. Known amounts of individual RNA targets were spiked into labeled RNA derived from a representative cDNA library made from the murine B cell line T10. The T10 cell line was chosen because of the cytokines being monitored, only IL-10 is expressed at a detectable level.

Because simply spiking the RNA mixture with the selected target genes and then immediately hybridizing might provide an artificially elevated reading relative to the rest of the mixture, the spiked sample was treated to a series of procedures to mitigate differences between the library RNA and the added RNA. Thus the "spike" was added to the sample which was then heated to 37 °C and annealed. The sample was then frozen, thawed, boiled for 5 minutes, cooled on ice and allowed to return to room temperature before performing the hybridization.

Figure 2A shows the results of an experiment
were spiked into the total RNA pool at a level of 1:3000
copies per cell). RNA frequencies are given as the molar
per mole of total RNA. Figure 2B shows a small
2A) containing probes specific for interleukin-2 and
and Figure 2C shows the same region in the absence of
hybridization signals are specific as indicated by the
unspiked images, and perfect match (PM) hybridizations
mismatches (MM) as shown by the pattern of alternating brighter
PM probes) and darker rows (corresponding to MM probes).
among the different perfect match hybridization signals
reflects the sequence dependence of the hybridizations
match (PM) probe was not significantly brighter than
of cross-hybridization with other members of the
patterns are highly reproducible and because detection does
probe per RNA, infrequent cross hybridization of
accurate detection of even low level RNAs



in whidi 13 target RNAS 
(equivalent to a few hundred 
amount of an individual RNA 
porti m of the array (the boxed region of 
in|crleukin-3 (IL-2 and IL-3,) RNA, 

the spiked targets. The 
C(}mparison between the spiked and 
are well discriminated from 

rows (corresponding to 
. The observed variation 
was highly reproducible and 
In a few instances, the perfect 
I mismatch (MM) partner because 
comfjlex RNA population. Because the 
not depend on only a single 
4 this i|/pe did not preclude sensitive and 
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SimUariy» infiequeot poor hybridization 
secondary structure, the presence of polymoir^sm or 
preclude detection. An analysis of the observed patterns 
hybridization led to the formulation of general rules 
probes with the best sensitivity and s^ficity described 



to, for example, RN A or probe 

database sequence errors does not 
of hybridization and cross 
for ^ selection of oligonucleotide 
hereuL 
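The point that detection rests on many PM/MM pairs rather than on any single probe can be illustrated with a small Python sketch. It is only one plausible way to summarize a probe set. The function name, the `min_diff` threshold and the intensity values are hypothetical and are not taken from the specification.

```python
def summarize_probe_pairs(pm, mm, min_diff=50.0):
    """Summarize one gene's probe set from paired PM and MM intensities.

    pm, mm:   equal-length lists of perfect-match and mismatch intensities.
    min_diff: hypothetical threshold (arbitrary intensity units) above which
              a pair is counted as supporting detection.
    """
    if len(pm) != len(mm) or not pm:
        raise ValueError("pm and mm must be non-empty and of equal length")
    diffs = [p - m for p, m in zip(pm, mm)]
    positive = sum(1 for d in diffs if d > min_diff)
    return {
        "mean_diff": sum(diffs) / len(diffs),         # average PM - MM
        "fraction_positive": positive / len(diffs),   # share of pairs with PM well above MM
    }


# Made-up intensities: the fourth pair is dominated by cross-hybridization,
# yet the set as a whole still indicates a specific signal.
summary = summarize_probe_pairs([900, 750, 1200, 300], [200, 150, 250, 320])
```

A single uninformative pair lowers the fraction of positive pairs but does not by itself prevent a call, which is the redundancy argument made in the text.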



3) Relationship between Target Concentration and Hybridization Signal
A second set of spiking experiments was carried out to determine the range of concentrations over which hybridization signals could be used for direct quantitation of RNA levels. Figure 3 shows the results of experiments in which the ten cytokine RNAs were spiked together into 0.05 mg/ml of labeled RNA from the B cell (T10) cDNA library at levels ranging from 1:300 to 1:300,000. A frequency of 1:300,000 is that of an mRNA present at less than a few copies per cell. In 10 µg of total RNA and a volume of 200 µl, a frequency of 1:300,000 corresponds to a concentration of approximately 0.5 picomolar and 0.1 femtomole (6 x 10^7 molecules or about 30 picograms) of specific RNA.

Hybridizations were carried out in parallel at 40°C for 15 to 16 hours. The presence of each of the 10 cytokine RNAs was reproducibly detected above the background even at the lowest frequencies. Furthermore, the hybridization intensity was linearly related to RNA target concentration between 1:300,000 and 1:3000 (Figure 3). Between 1:3000 and 1:300, the signals increased by a factor of 4 - 5 rather than 10 because the probe sites were beginning to saturate at the higher concentrations in the course of a 15 hour hybridization. The linear response range can be extended to higher concentrations by reducing the hybridization time. Short and long hybridizations can be combined to quantitatively cover more than a 10^4-fold range in RNA concentration.

Blind spiking experiments were performed to test the ability to simultaneously detect and quantitate multiple related RNAs present at a wide range of concentrations in a complex RNA population. A set of four samples was prepared that contained 0.05 mg/ml of sense RNA transcribed from the murine B cell cDNA library, plus combinations of the 10 cytokine RNAs each at a different concentration. Individual cytokine RNAs were spiked at one of the following levels: 0, 1:300,000, 1:30,000, 1:3000, or 1:300. The four samples plus an unspiked reference were hybridized to separate arrays for 15 hours at 40°C. The presence or absence of an RNA target was determined by the pattern of hybridization and how it differed from that of the unspiked reference, and the concentrations were detected by the intensities. The concentrations of each of the ten cytokines in the four blind samples were correctly determined, with no false positives or false negatives.

One case is especially noteworthy: IL-10 is expressed in the mouse B cells used to make the cDNA library, and was known to be in the library at a frequency of 1:60,000 to 1:30,000. In one of the unknowns, an additional amount of IL-10 RNA (corresponding to a frequency of 1:300,000) was spiked into the sample. The amount of the spiked IL-10 RNA was correctly determined, even though it represented an increase of only 10 - 20% above the intrinsic level. These results indicate that subtle changes in expression are sensitively determined by performing side-by-side experiments with identically prepared samples on identically synthesized arrays.
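The conversion from a molar frequency to absolute amounts, as quoted for the 1:300,000 spike, can be checked with a short Python sketch. The mean RNA molecular weight used here (about 3.3 x 10^5 g/mol, roughly a 1,000-nucleotide transcript) is an assumption introduced for illustration and is not stated in the text. The function and constant names are likewise hypothetical.

```python
AVOGADRO = 6.022e23     # molecules per mole
MEAN_RNA_MW = 3.3e5     # g/mol; ASSUMPTION: ~1,000 nt at ~330 g/mol per nucleotide


def spike_amounts(frequency, total_rna_g=10e-6, volume_l=200e-6, mean_mw=MEAN_RNA_MW):
    """Convert a molar frequency (e.g. 1/300000) into absolute target amounts.

    frequency:   moles of target RNA per mole of total RNA.
    total_rna_g: mass of total RNA in the hybridization, in grams.
    volume_l:    hybridization volume, in litres.
    mean_mw:     assumed mean molecular weight of the RNA population, in g/mol.
    """
    total_mol = total_rna_g / mean_mw      # moles of total RNA
    target_mol = total_mol * frequency     # moles of the spiked target
    return {
        "femtomoles": target_mol * 1e15,
        "picomolar": target_mol / volume_l * 1e12,
        "molecules": target_mol * AVOGADRO,
        "picograms": target_mol * mean_mw * 1e12,
    }


# 1:300,000 in 10 ug of total RNA and 200 ul
amounts = spike_amounts(1 / 300_000)
```

Under that assumed molecular weight the result is close to the figures in the text: about 0.1 femtomole, about 0.5 picomolar, about 6 x 10^7 molecules and roughly 30 picograms.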



Example 2
T Cell Induction Experiments Measuring Cytokine mRNAs as a Function of Time Following Stimulation.

The high density arrays of this invention were next used to monitor cytokine mRNA levels in murine T cells at different times following a biochemical stimulus. Cells from the murine T helper cell line (2D6) were treated with the phorbol ester 4-phorbol-12-myristate 13-acetate (PMA) and a calcium ionophore. Poly(A)+ mRNA was then isolated at 0, 2, 6 and 24 hours after stimulation. Isolated mRNA (approximately 1 µg) was converted to labeled antisense RNA using a procedure that combines a double-stranded cDNA synthesis step with a subsequent in vitro transcription reaction. This RNA synthesis and labeling procedure amplifies the entire mRNA population by 20 to 50-fold in an apparently unbiased and reproducible fashion (Table 2).

The labeled antisense T-cell RNA from the four time points was then hybridized to DNA probe arrays for 2 and 22 hours. A large increase in the γ-interferon mRNA level was observed, along with significant changes in four other cytokine mRNAs (IL-3, IL-10, GM-CSF and TNFα). As shown in Figure 4, the cytokine messages were not induced with identical kinetics. Changes in cytokine mRNA levels of less than 1:130,000 were unambiguously detected along with the very large changes observed for γ-interferon.

These results highlight the value of the large experimental dynamic range inherent in the method. The quantitative assessment of RNA levels from the hybridization results is direct, with no additional control hybridizations, sample manipulation, amplification, cloning or sequencing. The method is also efficient. Using current protocols, instrumentation and analysis software, a single user with a single scanner can read and analyze as many as 30 arrays in a day.



Example 3
Higher-Density Arrays Containing 65,000 Probes for Over 100 Murine Genes

Figure 5 shows an array that contains 65,000 different oligonucleotide probes (50 µm feature size) following hybridization with an entire murine B cell RNA population. Arrays of this complexity were read at a resolution of 7.5 µm in less than fifteen minutes. The array contains probes for 118 genes, including 12 murine genes represented on the simpler array described above, 102 additional murine genes, three bacterial genes and one phage gene. There are approximately 300 probe pairs per gene, with the probes chosen using the selection rules described herein. The probes were chosen from the 600 bases of sequence at the 3' end of the translated region of each gene. A total of 21 murine RNAs were unambiguously detected in the B cell RNA population, at levels ranging from approximately 1:300,000 to 1:100.

Labeled RNA samples from the T cell induction experiments (Fig. 4) were hybridized to these more complex 118-gene arrays, and similar results were obtained for the set of genes in common to both chip types. Expression changes were unambiguously observed for more than 20 other genes in addition to those shown in Figure 4.

To determine whether much smaller sets of probes per gene are sufficient for reliable detection of RNAs, hybridization results from the 118 gene chip were analyzed using ten different subsets of 20 probe pairs per gene. That is to say, the data were analyzed as if the arrays contained only 20 probe pairs per gene. The ten subsets of 20 pairs were chosen from the approximately 300 probe pairs per gene on the arrays. The initial probe selection was made utilizing the probe selection and pruning algorithms described above. The ten subsets of 20 pairs were then randomly chosen from those probes that survived selection and pruning. Labeled RNAs were spiked into the murine B cell RNA population at levels of 1:25,000, 1:50,000 and 1:100,000. Changes in hybridization signals for the spiked RNAs were consistently detected at all three levels with the smaller probe sets. As expected, the hybridization intensities do not cluster as tightly as when averaging over larger numbers of probes. This analysis indicates that sets of 20 probe pairs per gene are sufficient for the detection of expression changes at low levels, but that improvements in probe selection and experimental procedures are preferred to routinely detect RNAs at the very lowest levels with such small probe sets. Such improvements, which include, but are not limited to, higher stringency hybridizations coupled with the use of slightly longer oligonucleotide probes (e.g., 25-mer probes), are in progress.
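The subset analysis described above (re-analysing the data as if only 20 probe pairs per gene were present) can be sketched in Python as follows. This is a hedged illustration only. It assumes each subset is drawn independently without replacement and that a gene is summarized by its mean PM minus MM difference, neither of which is specified in the text. The function name and seed are hypothetical.

```python
import random
import statistics


def subset_means(pair_diffs, n_subsets=10, subset_size=20, seed=0):
    """Mean PM - MM difference for several random subsets of probe pairs.

    pair_diffs:  list of PM - MM differences for all retained probe pairs of one gene.
    n_subsets:   number of random subsets to draw (ten in the text).
    subset_size: probe pairs per subset (20 in the text).
    seed:        fixed seed so the illustration is reproducible.
    """
    if subset_size > len(pair_diffs):
        raise ValueError("subset_size exceeds the number of available probe pairs")
    rng = random.Random(seed)
    # Each subset is sampled without replacement; subsets may overlap each other.
    return [statistics.mean(rng.sample(pair_diffs, subset_size))
            for _ in range(n_subsets)]


# Synthetic differences for ~300 probe pairs (made-up values, not measured data)
_rng = random.Random(1)
diffs = [_rng.gauss(200.0, 80.0) for _ in range(300)]
means = subset_means(diffs)
spread = statistics.stdev(means)   # scatter between 20-pair subsets
```

The scatter between subset means gives a rough sense of why intensities from 20-pair sets cluster less tightly than averages over the full probe set.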



Example 4
Scale Up to Thousands of Genes

A set of four high density arrays, each containing 25-mer oligonucleotide probes to approximately 1650 different human genes, provided probes to a total of 6620 genes. There were about 20 probes for each gene. Feature size on arrays was 50 microns. This high density array was successfully hybridized to a cDNA library using essentially the protocols described above. Similar sets of high density arrays containing oligonucleotide probes to every known expressed sequence tag (EST) are in preparation.

Example 5
Direct Scale up for the Simultaneous Monitoring of Tens of Thousands of RNAs.

In addition to being sensitive, specific and quantitative, the approach described here is intrinsically parallel and readily scalable to the monitoring of very large numbers of mRNAs. The number of RNAs monitored can be increased greatly by decreasing the number of probes per RNA and increasing the number of probes per array. For example, using the above-described technology, arrays containing as many as 400,000 probes in an area of 1.6 cm² (20 x 20 µm synthesis features) are currently synthesized and read. Using 20 probe pairs per gene allows 10,000 genes to be monitored on a single array while maintaining the important advantages of probe redundancy. A set of four such arrays could cover the more than 40,000 human genes for which there are expressed sequence tags (ESTs) in the public data bases, and new ESTs can be incorporated as they become available. Because of the combinatorial nature of the chemical synthesis, arrays of this complexity are made in the same amount of time with the same number of steps as the simpler ones used here. The use of even fewer probes per gene and arrays of higher density makes possible the simultaneous monitoring of all sequenced human genes on a single, or small number of small chips.

The quantitative monitoring of expression levels for large numbers of genes will prove valuable in elucidating gene function, exploring the causes and mechanisms of disease, and for the discovery of potential therapeutic and diagnostic targets. As the body of genomic information grows, highly parallel methods of the type described here provide an efficient and direct way to use sequence information to help elucidate the underlying physiology of the cell.
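Example 6 maps each 20-mer probe to an 80-bit input vector and scores it with a backpropagation network of 80 input neurons, 20 hidden neurons and two output neurons. A minimal Python sketch of that encoding and of one forward pass is given here. It assumes that the 81st row of the first weight matrix is a bias term, and it uses random placeholder weights rather than the Table 3 and Table 4 values. All function names are illustrative.

```python
import math
import random

# One-hot base encoding as given in the text: A 1000, C 0100, G 0010, T 0001
BASE_CODES = {"A": (1, 0, 0, 0), "C": (0, 1, 0, 0),
              "G": (0, 0, 1, 0), "T": (0, 0, 0, 1)}


def encode_probe(seq):
    """Map a 20-mer to an 80-element list of 0/1 values, four bits per position."""
    seq = seq.upper()
    if len(seq) != 20:
        raise ValueError("probe must be a 20-mer")
    bits = []
    for base in seq:
        bits.extend(BASE_CODES[base])   # KeyError on anything other than A/C/G/T
    return bits


def sigmoid(x):
    """Sigmoid transfer function s(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(bits, w_hidden, w_out):
    """One forward pass through an 80-20-2 network.

    w_hidden: 81 rows x 20 columns; ASSUMPTION: row 80 holds the hidden-layer biases.
    w_out:    2 rows x 20 columns (no output bias, matching the stated 2 x 20 shape).
    Returns [hybridization score, cross-hybridization score], each in (0, 1).
    """
    inputs = list(bits) + [1.0]   # append constant bias input
    hidden = [sigmoid(sum(x * w_hidden[i][j] for i, x in enumerate(inputs)))
              for j in range(20)]
    return [sigmoid(sum(h * w_out[k][j] for j, h in enumerate(hidden)))
            for k in range(2)]


# Placeholder weights for illustration only (NOT the published values)
_rng = random.Random(0)
w1 = [[_rng.uniform(-0.5, 0.5) for _ in range(20)] for _ in range(81)]
w2 = [[_rng.uniform(-0.5, 0.5) for _ in range(20)] for _ in range(2)]
scores = forward(encode_probe("ACGTACGTACGTACGTACGT"), w1, w2)
```

With trained weights in place of the placeholders, candidate probes could be ranked by the first output and penalized by the second, in the spirit of picking an arbitrary number of the "best" probes.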



Example 6
Probe Selection Using a Neural Net

A neural net can be trained to predict the hybridization and cross hybridization intensities of a probe based on the sequence of bases in the probe, or on other probe properties. The neural net can then be used to pick an arbitrary number of the "best" probes. When a neural net was trained to do this it gave a moderate (0.7) correlation between predicted intensity and measured intensity, with a better model for cross hybridization than hybridization.

A) Input/Output Mapping.

The neural net was trained to identify the hybridization properties of 20-mer probes. The 20-mer probes were mapped to an eighty bit long input vector, with the first four bits representing the base in the first position of the probe, the next four bits representing the base in the second position, etc. Thus the four bases were encoded as follows:

A: 1000 C: 0100 G: 0010 T: 0001

The neural network produced two outputs: hybridization intensity, and crosshybridization intensity. The output was scaled linearly so that 95% of the outputs from the actual experiments fell in the range 0 to 1.

B) Neural Net Architecture.

The neural net was a backpropagation network with 80 input neurons, one hidden layer of 20 neurons, and an output layer of two neurons. A sigmoid transfer function was used: ( s(x) = 1/(1 + exp(-1 · x)) ) that scales the input values from 0 to 1 in a non-linear (sigmoid) manner.

C) Neural Net Training.

The network was trained using the default parameters from Neural Works Professional 2.5 for a backprop network. (Neural Works Professional is a product of NeuralWare, Pittsburgh Pennsylvania, USA). The training set consisted of approximately 8000 examples of probes, and the associated hybridization and crosshybridization intensities.

D) Neural Net Weights.

Neural net weights are provided in two matrices: an 81 x 20 matrix (Table 3) (weights.1) and a 2 x 20 matrix (Table 4) (weights.2).

Table 3. Neural net weights (81 x 20 matrix) (weights.1).

-0.0316746 -0.0263491 0.15907079 -0.035381(1 -0.0529314 0.09014647

0.19370709 -0.0515666 0.06444275 -0.04808:16 0.29237783 -0.034054

0.02240546 0.08460676 0.14313674 0.06798329 0.06746746 0.033717

0.16692482 -0.0913482 0.05571244 0.22345543 0.04707823 -0.0035547

0.02129388 0.12105247 0.1405973 -0.00663 i7 -0.0760119 0.11165894

0.03684745 -0.0714359 0.02903421 0.09420238 0.12839544 0.08542864

0.00603615 0.04986877 0.02134438 0.085225 9 0.13453935 0.03089394

0.11111762 0.12571541 0.09278143 0.11373715 0.03250757 -0.0460193

0.01354388 0.1131407 0.06123798 0.148184>4 0.07090721 0.05089445
-0.0635492
0.18790121 
0.02378313 
^.0403537 
.0.0694051 

-0.0731941 
0.06500423 
0.03036973 
0.17097448 
0.05442215 
0.08683836 
0.08829379 
0.0749867 
0.05022619 
0.08078995 
0.03405219 
0.11517255 
0.14236264 
-0.0866363 

-0.0163019 
038030735 
0.23144296 
0.46158856 
0.45084599 
0-55080342 
0.36848074 
0.47133151 
0.46017882 
0.33042327 
0.19591335 
0.19672842 
0.1710967 
0.03326527 

-0.0752053 
-0.0838031 
-0.2039919 
-0.0482095 
-0.0065265 
0.09420983 
0.00565713 
0.11275655 
-0.0850152 
0.1065109 
-0.0922655 



-0.0227965 
0.09624594 
0.10295142 
0.23566079 
-0.0637478 

0.08858298 
0.11003297 
0.06836637 
-0.007098 
0.23686385 
0.14047802 
0.17881326 
0.08564588 
0.14544216 
-0.0022168 
0.06140256 
0.17431773 
0.17182963 
0.11008894 

0.06256609 
0.28241798 
-0.3207987 
0.20649959 
-0.5829023 
0.30968052 
-0.5196409 
0.30909833 
-0.5331213 
0.4072904 
-0.4028497 
0.16133355 
-0.2728708 
0.22045346 

-0.0571054 
0.01667063 
-0.0532526 
0.04316666 
-0.2011867 
-0.0010159 
-0.1990354 
0.01772332 
•0.1931012 
0.07205399 
-0.1478272 



0.1081195 

-0.0865264 

0.05553147 

0.10335726 

0.2687766= 



0.13419148 
-0.0126238 
-0.0193289 
0.07325625 



0.39719725 
0.0403917 
0.02345118 
-0.0348659 
0.01979881 
0.00982503 
0.12465772 
0.05334799 
0.03519877 
0.05439407 
0.01802093 
0.09664405 
0.02306779 
0.40543473= 

0.16058824 
0.2882407 
0.56366867 
0.35099933 
0.51297456 
0.54485208 
0.33829662 
0.37790757 
0.60684419 
0.24270254 
0.30585453 
0.21780767 
0.1234024 
0.98782647- 

-0.1834571 

-0.0945634 

^.0828366 

-0.1732933 

-0.0434558 

-0,1768979 

0.11568499 

-0.0016695 

0.08498721 

-0.1304159 

0.08858409 



-0.0709359 
0.02953459 
0.0206452 
0.09989586 
-9.80E-06 
0.11756061 
0.13134554 
0.14341639 
0.12799838 
-0.0789278 
0.0954654 
0.01782892 
-0.0489743 



0.1414949S 

-0.2227429 

0.3597671^ 

-0.5071837 

0J349462: 

-0.7155912 

0.2161247: 

-0.464661 

0.4758600! > 

-0.375077 

0.3589654J 

-0.241956: 

0.0698708 > 



0.14263187 

-0,1137057 

0.1373803 

0.0550463 

-0.036913 i 

-0-2365085 

-O.069O084 

-0.249011 

0.0367351 4 

-0.1723315 

0.142065^^1 



0.08916269 -0.010634 

0.11497019 -0.0057307 

-0.0627925 -0.024633 

0.11329328 0.2555581 



0.14039235 
0.26901209 
-0.0079707 
0.07417496 
-0-0549301 
0.09054346 
0.09500015 
0.11468539 
0.01427337 
0.07312368 
0.00130152 
0.03840308 
-0.0006051 



0.15698175 
0.34799534 
0.20325871 
0.56459975 
0.43086055 
0J0799151 
0.41646513 
0.50172138 
0.28597337 
0.14083703 
0.24851802 
0.17847325 
0.1741322 



0.23244983 

-0.0605089 

0.20967795 

-0.1236805 

0.08891765 

-0.028868 

0.04572553 

0.14277624 

0.16172577 

0-11417327 

-0.035995 

0.05180788 

0.19077648 



-0.1197781 

0.38490915 

-0.343972 

0.216O5791 

-0.5538613 

0.29871368 

-0.5573701 

0.21158406 

-03345993 

030998308 

-0.2937264 

0.07593013 

0.05922241 



-0.0715346 

-0.1040308 

-0.0562212 

-0.0526818 

-0.0196296 

-0.0150508 

-0.1509431 

0.09066539 

-0.1446398 

0.09151162 

-0.0314846 



-0.0524248 

0.04263301 

-0.2127942 

0.06739104 

-0.1314755 

0.14120786 

-0.0575663 

0.05357879 

-0.199778 

0.05596334 

-0.1985286 
0.19862956
^.0721622 
0.20344114 

0.02848385 

-0.0742345 

-0.0747124 

-0.1090026 

0.07326921 

-0.0586419 

0.05150735 

-0.0001517 

0.12470152 

-0.049451 

0.18880892 

0.0234996 

0.16712189 

-0.0487184 

0.146753 

-0.0739603 

-0.1548294 

-0.1728483 

-0.0950026 

0.07691807 

-0.035349 

0.04664604 

0.00194441 

0.13777457 

-0.0514387 

0.0994828 

-0.1571046 

0.13411179 

-0.0304715 

0.13328855 

0.0217719 

0.04117605 

0.08965616 

0.08670026 

0.00865917 

-0.1322389 

0.1623435 

-0.0887844 

0.08622501 

-0.0290916 

0.17453311 



-0.0502828 
-0.1506944 
^.061502 

0.00254791 

-0.0545447 

0.13325705 

-0.0988943 

0.02654305 

-0.08015 

-0.1449667 

-0.0521925 

-0J589714 

0.05717351 

-0J259364 

-0.1177034 

-0.0122822 

0.01467591 

-0.0931665 

0.17018235 

-0.0908961 

0.12621336 

-0.1562225 

0.13016214 

-0.302975 

0.08887579 

-0.1631221 

0.00339417 

-0.0722146 

-0.035077 

-0.1713289 

-0.0159559 

-0.0845574 
-0.1492282 
-0JI02229 
0.03997391 
^.1572192 
0.03785197 
-0.2995701 
0.21433547 
-0.3362183 
0.07691832 
-0.2421202 
-0.0839412 
-0.1529943 



-0.11447 

0.14910588 

-0.1647823= 

-0.0646306 

•0.1119258 

-a0508435 

-0.0445145 

-0.1239398 

-0.0073617 

0.06144469 

0.21106339 

-0.0061972 

0,14784867 

0.04754021 

0.02549919 

-0.109654 

-0.0759871= 

-O.1475015 

-0.0636651 

-0.0415557 

-0.1321529 

^.0917397 

0.10801306 

0.03706082 

-0.0210248 

0.11259725 

-0.2007502 

0.07706029 

-0.106266 

0.14155054 

-0.1296399= 

0.17682472 
0.11350834 
0.18922243 
0.06022124 
0.00942572 
0.21052985 
-0.0835971 
0.08046963 
-0.1335399 
0.11459036 
0.00845924 
0.10590381 
0.02726452 




-0.144007! 
0.0329721 ) 



0.02634032 

0.107653 H 

-0.1761451 

0.0380297 P 

0.03043281 

-0.168288( 

0.1005446 

-0.4393073 

0.07370331! 

-0.3082401 

^.057658*? 

-0.167107^ 

-0.032736^ 



0.0728498: 

0.0469337? 

0.0491511: 

-0.1091831 

0.1871132^ 

-0.3151104 

0.12322487 

-0.1427284 

-0.0984519 

-0.0703103 

0.04593663 

-0.059766 

0.00283311 



-0.0552084 

-0.1121938 

-0.0940011 

-0.1808036 

0.07957069 

-0.3564453 

0.14536868 

-0.1548838 

0.10284293 

-0.056257 

-0.0151014 

-0.1593935 

0.06178628 



0.01366408 
-0.0266356 



-0.0654473 

•O.0606677 

-0.0883804 

-0.0484086 

0.09781751 

0.00400978 

0.22570252 

0.0053312 

0.25447422 

0.01207511 

0.02376083 

0.00582423 

0.01481733 



-0.0609536 

-0.2586751 

-0.0436857 

-0.0989133 

0.04599057 

0.0105284 

0.07198878 

0.09078772 

-0.0939511 

0.1548807 

-0.2334163 

0.13616422 

0.01067419 



0.07044557 

0.02089526 

0.08787836 

0.04742034 

0.12980177 

0.01492627 

0.08446889 

-0.021533 

0.16658102 

0.01970494 

0.19088623 

-0.0399097 

0.06624542 



0.11101657 
-0.2501774 



0.04731949 

0.05693235 

-0.0777852 

-0.0337959 

0.02590732 

0.01282504 

-0.3763289 

0.13283829 

-0.3289591 

-0.1141143 

-0.2828108 

-0.0715723 

-0.0636454 



-0.0945313 

0.15550844 

-0.031472 

0.0294641 

-0.2039073 

0.10938062 

-0.2535323 

0.08646259 

-0^18395 

0.13540466 

-0.0250262 

0.22308858 

-0.360891 



-0.1482136 

0.00104415 

-0.1835242 

-0.0744867 

-0.2440033 

0.04286519 

-0.1689682 

0.0558197 

-0J004514 

0.08940192 

^.1967196 

•0.0861852 

0.01004315 
-0.158326 -0.0149114 -0.1479269=



0.11429903 

0.33529782 

0.09062219 

0.49959001 

0.52447188 

0.78773916 

0.472%903 

0.80210346 

0.45408139 

0-56882453 

0.4490836 

0.39600489 

0.35209069 

0.13266218 

-0.0112394 

0.09505147 

0.16811867 

-0.0380792 

0.42699403 

-0.0381787 

0.74187845 

0.02410492 

0.7597335 

0.15992545 

0.25537059 

-0.1888455 

0.08058657 

-0.2839164 

-0,1147067 
-0.5139894 
0.19038832 
-0.9312411 
-0.5433689 
-0,9422795 
-0.6896155 
-1.0231192 
-0.6568274 
-0.7811472 
-0.4704399 
-0.7735854 
-0.1544528 
-0.4815812 



-0.0432327 

0.24581231 

-0.2974442 

0.22195752 

-0.5555881 

0.45518181 

-0.672706 

0.40167108 

-0.7316507 

0.29653791 

-0.4754149 

0.24787127 

-0.203685 

0.20236486 

0.01601524 
-0.0220034 
-0.4498019 
-0.0468904 
^).6348S44 
0.09532065 
-0.8996705 
-0.0632124 
-0.6287012 
-0.1780757 
-0.4526066 
0.1974159 
-0.0768841 
0.12684187 

-0.0084124 

-0.6221746 

0,55414283 

-0.410718 

0.92539561 

-0.6914638 

1.1251011 

-0.5556009 

1.1967098 

-0.5740913 

0.51728982 

-0.3031097 

0.2042688 

-0.5319371 



0.14520219 

0.07311282 

0.46336258 

0.32254469 

0.68481833 

0.71273196 

0.69020337 

0.50383294 

0.48975253 

0.4472059 

0.46366793 

0.20359448 

0.25115264 

1.1078833= 

0.11363719 

0.0714381 

0.10313182 

0.37975076 

0.00025528 

0.50065184 

0.03180836 

0.73732454 

0.03615654 

0.3820785 

-0.0761788 

0.01620384 

-0.316401 

-0.2450078= 

-0.5239977 

-0.3979228 

-1.1652025 

-0.1498093 

-0.9013531 

-0.7839714 

-0.8161536 

-0.7499282 

-1.150661 

-0.4527726 

-0.545236 

-0.4083092 

-0.8989772 

-1.3798244= 



0.5186048: 

-02268714 

0.1714583< 

-0,4994924 

0.2025146i 

-0.7655811 

0.37193871 

-0.6195157 

0.4798485< 

-0.5177853 

0.3137858: 

-0.203447 

0.2131310< 



-0.144006S 

-0,1994763 

-0.014999'3 

-0.7120741 

0.062027Oi 

-0.74135r 

0.0401035 1 

-0.81888K 

-0.124824] 

-0.564246: 

-0,024251^ 

-0.130653: 

0.0977949$ 



-0.502159 

0.30136263 

-0.3686967 

0.55332947 

-0.614531 } 

1.4393494 

-0.8204682 

1.281976 

-0.5503615 

0.649117* 5 

-0.8311051 

-0.0152683 

-0.3088974 



0.19151463 

0.31717882 

0.32802406 

0.75497276 

0.39860719 

0.7155844 

0.47959387 

0.80366057 

033738744 

0.36228263 

0.48470935 

0.25734761 

0.12461348 



0.05522444 
0.12304886 
0.47659361 
-0.1078557 
0.57867163 
-0.0193744 
0.82366729 
0.04538922 
0.56647652 
-0.0609947 
0.35473567 
-0.1468564 
0.08537519 



0.02636886 

-0.742976 

-0.4750175 

-1.0870041 

^.5512772 

-0.7092296 

-0.8957642 

^.9347371 

-0.6640182 

-0.6970047 

-0.4240301 

-0.2330878 

-0.2014994 



-0.1127352 

0.35736522 

-0.3898261 

0.35112098 

-0.7198414 

0.39701831 

-0.9032337 

0.3884458 

-0.5510914 

0.40129057 

-0,2453159 

0.17168433 

0.10632347 



-0.0711868 
-0.1611445 
-0.4639786 
0.10635795 
-0.6733171 
-0.1180785 
-0.6429569 
-0.1471086 
-0.6294683 
-0.0350918 
-03512402 
0.25235301 
-0.0738487 



0.1470097 

-0,4011821 

0.54713631 

-0.4378341 

1,0310978 

-0.894987 

1.3315079 

-0.6562014 

0.84698498 

-0.57S9697 

0.37167478 

-0.5839304 

0.11505035 
0.07143499
0.1549352 
0.44703272 
0^95928 
0.53066176
0.1702383 
0.5403164 
-0.092208 
0.22238699 
-0.180493
0.17421109 
-0.1982318 
0.05979542 
-0,1978694 


0.06230025 
0.1073643 
-0.0272076 
0.33091745 

0.01069087
0.11231339 
-0.0213237 
0.12980145 
0.15833771 

-0.0696303
0.05538817 
-0.0677462 
0.14652038 
-0.1929855 


-0.0786668 
-0.0029815 
-0.1259448 
-0.1091328 

0.00333312
0.14768517 
0.061 1263 
0.09951859 
0.05554885 

0.01806S34
0.10942505 
^).0365961 
0.01934035 
-0.0525739 


0.19904579 
0.01703933 



-0.1589592 

-O.0608833 

-0.6194252 

-0.119705 

-0.9705743 

0.02221953 

-0.5077381 

0.21902563 

-0.156256 

0.17164391 

-0.0730809 

0.06996673 

-0.0623277 

0.051 19598 

-0.0752745 

-0.090154 

-0.1014201 

-0-0610701 

0.02569587 

-0.0392407 

-0.02616% 

-0.038394 

0.01835199 

0.03802699 

0,01067943 

-0.0772208 

0.06084725 

0.00694158 

0.05454836 

-0.0837616 

-0.0845026 

0.0090488 

-0.2812204 

0.02989549 

-0.1895157 

0.14843601 

-0.3743193 

0.09599103 

•0.0473638 

-0.0962418 

-0.0073082 

0.06086259 

•0.2001437 
0.06875326 



0.04816094 

0^1059546 

0.19459446 

0.4913742 

0.1324198 

0.44412452 

0.00849557 

0.25788471 

-0.2092034 

0.15690604 

-0.3717274 

0.19735655 

-0.2521037 

-0.2067173= 

0.32974288 

-0.0938452 

0.19723812 

0.01335303 

0.11676744 

0.06117272 

0.09474246 

0.08167668 

0.04420554 

0.0806741 

0.04131892 

0.16641215 

-0.1150111 

0.26604816=^ 

-0.0834711 
0.02468397 
0.10171869 
0.06142418 
0.02039073 
0.09454407 
0.08583955 
0.12351749 
•0.0205463 
-0.0570596 
0.01151769 
0.01007566 
-0.0489736 
-0.1788069=^ 



-0.0301291 

-0.4705076 

-0.052389* 

-0.845500 S 

0.089829: 1 

-0.770024 ♦ 

0.161140! 

-0J86151 ) 

0.1645882 

-0.0254561 

0.143643^ 

O.O562S506 

0.0944353 



0.00985043 

0.0070432* 

-0.093540 I 

0.021 5681 S 

-0.021313 

-0,0234321 

-0.010075(> 

-0.0105371 > 

0.02605363 

0.03993953 

-0.02676«> 

0.09142463 

-0.068787(i 



0.07707115 

0.0353179^ 

-0.054104: 

-0.167912 

-0,052828 

-0.186017(1 

0.0938281 2 

-0.1327621 

0.12675567 

-0.152338 

0.09737793 

-0.004975; I 

0.1045731 1 



0.04977471 0.2662821 7 
0.09066898 -0.200354!! 



0.15144217 

0,16360784 

OJl 194624 

0.15694356 

0,43900672 

0.10496679 

0.31764683 

-0.2022993 

0.20111787 

-0.1990184 

-0.0215865 

-0.241524 

-0.0492548 



0-07881941 

0.2569764 

0.0913924 

0.21619918 

0.1322203 

0-14693312 

0.10580003 

0.02142166 

0-27427858 

-0,0121658 

0-14418064 

0.02115551 

0.10878915 



0.05659099 

-0.1437671 

0.05257236 

-0.098868 

-0.0439769 

-0.0505908 

-0.0001466 

0.10949049 

0.0775801 

0.08384241 

0.07082167 

0.01404589 

-0.0520154 



0.19910193 
0.26507998 



-0.3037405 
-0.0684895 
-0.8030509 
-0.0023983 
-0.8588745 
0.14J37991 
-0.5240273 
0.13711917 
-0.1418906 
0.10211211 
-0.2363243 
0.12768924 
0.05238663 



-0.0835249 

0.08700065 

-0.0728388 

-0.0909865 

0.11848255 

0.13509636 

-0.0147534 

-0.0161705 

0-05774866 

0.07568218 

0.0897231 

-0.0876383 

0.32776353 



-0.0285798 

0-10122854 

0.04065102 

0.02574896 

-0.0458286 

0.088718 

-0.4065202 

0.07129322 

-0.1869074 

0.00704122 

-0.2184597 

-0.0406134 

-0.0454775 



0.15184447 
0.0629771 
039202845 -0.6033413

0.65535748 0J2430753 

0.95144385 -1.2075449 

0.99852085 0.48870567 

1.2572207 -1J854638 

0.73526824 0.31977594 

0.95438999 -1.2543333 

0.45917389 0.27823627 

0.33946255 -0.5412283 

0.19083619 0.37056214 

0.30190364 -0.3655235 

0.18584418 0.34009755 

0.13698889 -0.0798945 

0.31540671 0.08274947 

0.001195)8 -0.1978176 

0.74023747 0.38564634 

0.06987014 -0.5168169 

0.7840901 0.4372991 

0.05702339 -0.5161278 

0.7051%74 0.15731441 

0.11747536 -0.612968 

0.81293154 0.18651071 

0.24770954 -0.4320194 
0-54755467 0.08819038 
0.03049339 -0.1913544 
0.1008145 0.01412579 

-0.0048454 0.1204864 
-0.0273505 0.10494121 
0.1325469 0.15324508 
-0.0007111 0.13285491 
-0.118853 0.26435438 
0.07947435 0.07329605 
-0.162177 0.18712705 
0.04106503 0.08498254 
-0.0012895 0.2371086 
0.13412228 0.10756335 
-0.142963 0.09792294 
0.19903891 0.02989559 
-0.0027455 0.16604523 
0.25000233 0.05931267 



0.04679342 0.10158926 
-0.1704439 0.302394 
-0.215752 0.32740423 




0.57940209 -0.0460919 

0.64831889 -1.0950515 

0.94851351 -0.0852669 

1.7470727 -1.7586045 

0.89351815 039586932 

1.2270083 -1.2818555 

0.55854511 0.1672449 

0.26928344 -0.9804664 

0.1085042 0.44658452 

024114503 -0.3020035 

0.33355939 0.44246852 
4.5490937« 

0.3366704 0.17313539 

0.11212139 -0.428847 

0.59532708 -0.0309942 

0.03748908 -0,6475483 

1.0081589 -0.0517421 

0.13783893 -0.8574924 

0.66693234 -0,0496743 

0.08724558 -0,7325026 

0.98160452 0.02407174 

0.03182137 -0.7051651 

0.72470272 0.12951751 

0J2105552 -0.3489864 

0.4782092 -0.098419 
0.42727205= 

0.15507312 0.25648347 

0.1988914 0.09454013 
-0.01398 0.08281901 
-0.1658676 0.25348473 
^.0775707 0.09143513 
-0.0903666 0.10754076 
0.03216886 0.04698242 
^.0325038 0.29328787 
0.14713244 -0.053306 
-0.0486093 0.05799349 
0.06907349 0.05942665 
0.15750381 -0.0373194 
0.06245366 -0.0775013 
0.22881882= 



-0.122116 0.2349100? 
-0.0671487 03325144^ 
-0.1597161 0.1895090< 



0.534 19203 -0.7680888 

0,80829531 0.05049393 

0.94320357 -1.680338 

0.56886804 0.66196042 

1.586942 -1.6365775 

0.71813524 037488377 

0.56084049 -0.7980669 

0.62299174 033984308 

0.39120093 -0.5676367 

0.39015424 0.09788869 

0.17172456 -03479928 



0.01228174 -0.2679709 

0.57447821 -0.0305296 

-0.0107875 -0.7312108 

0,87958473 0.05327692 

0.08651814 -0.761238 

0.90612286 0.06334394 

0.07689167 -0.5775976 

0.65517086 0.29064488 

0.02613025 -0.677594 

0.89682412 0.181806 

0.14626819 -03964331 

0.4620938 0.06516677 

-0.0160188 0.07177288 



0.03982652 0.14641231 

-0.0560908 0.07466536 

0.07909692 0.36858437 

0.08835109 0.16466415 

-0.1019902 0.29236633 

0.04456592 0.18368921 

-0.0385783 0.2276271 

0.01249749 0.10016124 

-0.0808243 0.28909287 

0.21323961 -0.0118695 

-0.143813 0.21673524

0.12471988 0.10462648 

-0.0160873 0.21550164 



-0.0625733 0.19985424
-0.0581705 0.21095584 
-0.1232446 0.27883759 
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-0.0430407 
-0.1322077 
0.10109599 
-0.0808031 
0.13912162 
-0.2270383 
0.01596376 
-0.1284984 
6.00538179 
-0.0861699 
0.20031671 

0.37838998 
-0.038453 
0.58336282 
0.07741276 
.0.85640681 
-0.0642752 
0.79736245 
0.02592243 
0.88004726 
0J0492786 
0.54989374 
0.I7I51839 
0.51068121 
-0.1255455 



0.02952595 
-0.6262965 

0.02978903
-0.7473708
0.00290797
-1.0829539
0.06966544

-0.8823278
0.0915129
-0.499517
-0.1106236
-0.6255118

-0.1468192
-0.213571 



0.06424081 
-0.1032737 
0.05533361 
0.05850215 
-0.012636 



0.04886867 

0.2981362 

0.23081669 

0.15750171 

0.04256131 

0.22945035 

0.03504543 

0.24145114 

0.05302088 

0.05814215 

0.23140682 

0.00934576 
0.24550894 
-0.2145292 
0.45081589 
-0.6068144 
0.37914035 
-0.7102081 
0J7O13471 
-0.6990998 
0J9735735 
-OJ660355 
0.39539635 
-0.35(^096 
0.35898197 

-0.0751979 

-0.1423945 

0.20563391 

-0.0415357 

0.6284017 

-0.1822221 

0.75524592 

-0.3404879 

0.44590429 

-0.4873153 

0.27437851 

-0.1046614 

-0.1719856 

-0.1335077 

-0.0978306 
O.U 563963 
-0.033985 
0.03830531 
-0.1925185 



-0.0914212 0.28I925|I4 



0.1254565 
-0.1617257 



0.15627012 
0.295087^3 



0.08072432 0.129906)1 



-0.1625126 0.252321 



0.18167619 0.00080916 

0.00964208 0.117578^9 

0.20540115 0.075808^3 

-0.1001294 0.275054 

0.21307872 0.013722t4 
0.16010799« 



-0.139213 

0J0729383 

-0.2378269 

0.65251595 

-0.1187844 

0.71409059 

0.14268413 

0.82774776 

0.23456772 

0J5497372 

0.1205707 

0.50465524 

-0.2094818 

0.79502285- 

-0.2556099 
-0.0537339 
-0.5457558 
0.18283925 
-0.6397845 
-0.1832336 
-0.9053063 
-0.0334436 
-0.7808504 
-0.2889721 
-0.6061368 
^.2710638 
-0.4140109 
-0.7155944 



0.298238: 8 

-0.28073«5 

0.259394<2 

-0.454313 

0359594: 8 

-0.718094 

0.413746:3 

-0.8136597 

0.24596012 

-4).6593497 

0.22377755 

-0.379128 

0.31471257 



-0.304091 7 
0.111893^2 
-0.366651 \ 
0.281534^9 
-0.560678 
0.493714^9 
-0.582697? 
0.501 304C 9 
-0.439962 J 
0.47303999 
-0.416652 J 
0.26425925 
-0.105829 > 



-0.1169782 0.1390949(3 

-0.0709175 -0.028875 

-0.049436 0.115206515 

-0.0893732 -0.0066427 

0.13028348 -0.0045111 
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0.05275658 
O.04U6358 
-0.0405337 
-0.1935954 
0.04736055 
-0.1253632 
-0.0230768 
-0.0932236 
0.22654785 
0.04515802 



0.40640026 
-0.0689575 
0.64761585 
-0.0671543 
0.71842372 
0.21169594 
0.75569016 
0.24068722 
0.67229778 
0.20656242 
0.46045718 
0.07184427 
0.18174268 



-0.0942183 

-0.3791296 

-0.1922515 

-0.7847292 

-0.1479581 

-0.6362705 

-0.114608 

-0.57275 

-0.1189605 

-0,4015501 

-0.0637606 

-0,4123208 

0.02873472 



0.21014904 
0.08507752 
-0.0497829 
0.29120663 
-0.0530935 
0.15695702 
0,04350457 
0.14288881 
0.02395938 
-0.0269269 



-0.067578 

0.26537073 

-0.3581158 

0.48592216 

-0.7140775 

0.27888221 

-0.7394939 

0.45081198 

-0.8148533 

0.3752968 

-0.519361 

0.36315975 

-0.1241962 



-0.0541431 

-0.3382006 

0.29512301 

-0.2313099 

0.57049137 

-0.2790937 

0.90401584 

-0.3842527 

0.59226018 

-0.2875251 

0.33875695 

-0.2157291 

-0.1210428 



-0.0838893 
-0.1718288 
-0.0279296 
0.06969514 
0.05260766 



-0.1300299 

-0,026291 

-0.0170352 

0.13403182 

-0.2759708 
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-0.0395793 
-0.0917266 
0.23327024 
0.10926479 
0.12219627 
-0.1091286 
-0.0210903 
-0.1233738 
0.06584878 



0.03069885 

-0.2185763 

-0.0898143 

4).l 167006 

0.05705986 

-0.075133 

0.11607172 

-0.0760847 

-0.0323083 



0.07913893 
0.04743406 
-0.0578982 
0,18223672 
-0.0505442 
0.02949276 
-0.0943146 
0.00098273 
^.0581293^ 



-0.1470363 
-0.0364127 
-0.2096201 
0.09710353 
-0.1334345 
-0.0217044 
-0.1014408 
0.07522969 



Table 4. Second neural net weighting matrix (2 x 21) (weights2).






-0.5675537 
-0.5328685 
-0.209518 
2.0453076 



-0.6119734 
0.31165671 
1.6362301 
0.08412334 



0.55343837 0.68506879 

0.12712023 -1.7462951 

0.19891988 -4.0000067 

3.1242371 0.22860088 



0.20069507 0.26132998 

-0.9999997 -0.4128213 

-1.9999975 -0.2563241
-0.1645829= 

-1.1869608 0.39551663 

0.0818732 6.111361 

-0.5605077 1.3601962 
1.6726165« 



FA Code for running the net

Code for running the neural net is provided below in Table 5 (neural_n.c)
and Table 6 (lin_alg.c).
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0.09080192 
0.00991712 
0,09257686 
0.03838636 
-0.0204458 
4).078292I 
0.02903902 
0.05794976 



0.19741131 
-0.2093729 
0.00566842 
-0.2026017 
0.01167099 
-0.1160332 
0.02963065 
-0.1959872 



-0.5071653 0.2793434 
-1.0000007 -0.6456627 
0.04389827 1.7597554 



0.38050765 0.40832204
0.62210494 0.42921746
1.7318885 -1.0558798



Table 5. Code for running the neural net (neural_n.c).

#define local far

#include <windows.h>
#include <alloc.h>
#include "utils.h"
#include <string.h>
#include <ctype.h>
#include <stdio.h>
#include <math.h>
#include <mem.h>
#include "des_util.h"   /* header name partly illegible in the source */
#include "chipwin.h"
#include "lin_alg.h"

void reportProblem( char local * message, short errorClass );
char iniFileName[] = "designer.ini";

/* Apply the logistic function 1/(1+exp(-x)) to every element in place. */
static void sigmoid( vector local * transformMe ){
    short i;
    for( i = 0; i < transformMe->size; i++ )
        transformMe->values[i] = 1/(1+exp(-1 * transformMe->values[i]));
}

/* Count the columns in the first row of the buffer. */
static short getNumCols( char far * buffer ){
    short count = 1;
    for( ; *buffer != 0; buffer++ )
        if( *buffer == '\t' ) count++;   /* delimiter illegible in the source; tab assumed */
    return count;
}

/* Count the null-separated strings in a double-null-terminated buffer. */
static short getNumRows( char far * buffer ){
    char far * last, far * current;
    short count = -1;
    current = buffer;
    do{
        count++;
        last = current;
        current = strchr( last+1, 0 );
    }while( current > last+1 );
    return count;
}

/* Parse numRows x numCols floats from the buffer into theMat. */
static void readMatrix( matrix local * theMat, char far * buffer ){
    short i, j;
    char far * temp;
    temp = buffer;
    for( i = 0; i < theMat->numRows; i++ ){
        for( j = 0; j < theMat->numCols; j++ ){
            while( isspace( *temp ) || (*temp == 0 && *(temp-1) != 0 ) ) temp++;
            sscanf( temp, "%f", &theMat->values[i][j] );
            while( !isspace( *temp ) && *temp != 0 ) temp++;
        }
    }
}

#define MaxNumLines (20)
#define MaxLineSize (1024)

short readNeuralNetWeights( matrix local *weights1, matrix local *weights2 ){
    char far * buffer;
    int copiedLength;
    short numCols, numRows;

    buffer = farcalloc( MaxNumLines * MaxLineSize, sizeof( char ) );
    if( buffer == NULL ){ errorHwnd( "failed to allocate file reading buffer" ); return FALSE; }

    /* section names "weights1"/"weights2" are partly illegible in the source */
    copiedLength = GetPrivateProfileString( "weights1", NULL, "\0\0", buffer,
                       MaxNumLines * MaxLineSize, iniFileName );
    if( copiedLength < 10 || copiedLength >= (MaxNumLines * MaxLineSize - 10) ){
        errorHwnd( "failed to read .ini file" );
        farfree( buffer );   /* the original listing returned here without freeing */
        return FALSE;
    }
    numCols = getNumCols( buffer );
    numRows = getNumRows( buffer );
    if( !allocateMatrix( weights1, numRows, numCols )){
        farfree( buffer );   /* the original listing returned here without freeing */
        return FALSE;
    }
    readMatrix( weights1, buffer );

    copiedLength = GetPrivateProfileString( "weights2", NULL, "\0\0", buffer,
                       MaxNumLines * MaxLineSize, iniFileName );
    if( copiedLength < 10 || copiedLength >= (MaxNumLines * MaxLineSize - 10) ){
        errorHwnd( "failed to read .ini file" );
        farfree( buffer );
        return FALSE;
    }
    numCols = getNumCols( buffer );
    numRows = getNumRows( buffer );
    if( !allocateMatrix( weights2, numRows, numCols )){ farfree( buffer ); return FALSE; }
    readMatrix( weights2, buffer );
    farfree( buffer );
    return TRUE;
}

/* Two-layer forward pass: input -> sigmoid hidden layer (plus bias unit) -> sigmoid output. */
short runForward( vector local *input, vector local *output,
                  matrix local *weights1, matrix local *weights2 ){
    vector hiddenLayer;

    if( !allocateVector( &hiddenLayer, (short)(weights1->numRows + 1) )) return FALSE;
    if( !vectorTimesMatrix( input, &hiddenLayer, weights1 )){
        freeVector( &hiddenLayer ); return FALSE;
    }
    sigmoid( &hiddenLayer );
    hiddenLayer.values[ hiddenLayer.size - 1 ] = 1;   /* bias unit */
    if( !vectorTimesMatrix( &hiddenLayer, output, weights2 ) ){
        freeVector( &hiddenLayer ); return FALSE;
    }
    freeVector( &hiddenLayer );
    sigmoid( output );
    return TRUE;
}






static vector inputVector = {NULL, 0}, outputVector = {NULL, 0};
static matrix firstWeights = {NULL, 0, 0}, secondWeights = {NULL, 0, 0};

static short beenHereDoneThis = FALSE;

/* Load the weights and size the input/output vectors on first use. */
static short makeSureNetIsSetUp( void ){
    if( beenHereDoneThis ) return TRUE;
    if( !readNeuralNetWeights( &firstWeights, &secondWeights )) return FALSE;
    if( !allocateVector( &inputVector, firstWeights.numCols )) return FALSE;
    if( !allocateVector( &outputVector, secondWeights.numRows )) return FALSE;
    beenHereDoneThis = TRUE;
    return TRUE;
}

void removeNetFromMemory( void ){
    freeVector( &inputVector ); freeVector( &outputVector );
    freeMatrix( &firstWeights ); freeMatrix( &secondWeights );
    beenHereDoneThis = FALSE;
}

/* Estimate hybridization (hyb) and cross-hybridization (xHyb) for a probe sequence. */
short nnEstimateHybAndXHyb( float local * hyb, float local * xHyb, char local * probe ){
    short probeLength, i;

    if( !makeSureNetIsSetUp() ) return FALSE;
    probeLength = (short)(strlen( probe ));
    if( (probeLength * 4 + 1) != inputVector.size ){
        // reportProblem( "Neural net not set up to deal with probes of this length", 0 );
        if( (probeLength * 4 + 1) > inputVector.size ){
            // reportProblem( "probe being trimmed to do analysis", 1 );
            probeLength = (short)(inputVector.size / 4);
        }
    }
    memset( inputVector.values, 0, inputVector.size * sizeof( float ) );
    inputVector.values[inputVector.size-1] = 1;   /* bias input */
    for( i = 0; i < probeLength; i++ )
        inputVector.values[i * 4 + lookupIndex( tolower(probe[i]) )] = 1;
    runForward( &inputVector, &outputVector, &firstWeights, &secondWeights );
    *hyb = outputVector.values[0];
    *xHyb = outputVector.values[1];
    return TRUE;
}
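The input encoding used by nnEstimateHybAndXHyb can be sketched in portable C as follows. This is a minimal illustration, not the listing itself: the helper names (base_index, encode_probe, weighted_sum) are hypothetical, and the a/c/g/t column order is an assumption, because lookupIndex() is not shown in the source. Each base occupies four input slots, one of which is set to 1, and the final slot is a bias input fixed at 1. A hidden unit's pre-activation is then the weighted sum of these inputs, to which the listing applies the sigmoid.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BASES_PER_POSITION 4

/* Map a base to a slot 0..3; returns -1 for any other character.
   The a,c,g,t ordering is an assumption (lookupIndex() is not shown). */
static int base_index(char base) {
    switch (base) {
    case 'a': case 'A': return 0;
    case 'c': case 'C': return 1;
    case 'g': case 'G': return 2;
    case 't': case 'T': return 3;
    default:            return -1;
    }
}

/* One-hot encode a probe into 4*len + 1 floats; the last slot is the bias.
   Returns the number of inputs written, or 0 if the probe does not fit
   or contains a character other than a, c, g, t. */
static size_t encode_probe(const char *probe, float *inputs, size_t capacity) {
    size_t len = strlen(probe);
    size_t n = BASES_PER_POSITION * len + 1;
    size_t i;

    if (n > capacity) return 0;
    memset(inputs, 0, n * sizeof(float));
    for (i = 0; i < len; i++) {
        int slot = base_index(probe[i]);
        if (slot < 0) return 0;
        inputs[BASES_PER_POSITION * i + (size_t)slot] = 1.0f;
    }
    inputs[n - 1] = 1.0f;
    return n;
}

/* Pre-activation of one unit: dot product of a weight row with the inputs. */
static float weighted_sum(const float *weights, const float *inputs, size_t n) {
    float acc = 0.0f;
    size_t i;
    for (i = 0; i < n; i++) acc += weights[i] * inputs[i];
    return acc;
}
```

Under this encoding a 25 mer probe yields 4 x 25 + 1 = 101 inputs, and the 21 columns of the second weighting matrix (Table 4) correspond to 20 hidden units plus the bias unit.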




Table 6. Code for running the neural net (lin_alg.c).



lin_alg.c

#include "utils.h"
#include "lin_alg.h"
#include <alloc.h>

short allocateMatrix( matrix local * theMat, short rows, short columns ){
    short i;

    theMat->values = calloc( rows, sizeof( float local * ) );
    if( theMat->values == NULL ){ errorHwnd( "failed to allocate matrix" ); return FALSE; }
    for( i = 0; i < rows; i++ ){
        theMat->values[i] = calloc( columns, sizeof( float ) );
        if( theMat->values[i] == NULL ){
            errorHwnd( "failed to allocate matrix" );
            for( --i; i >= 0; i-- )
                free( theMat->values[i] );
            return FALSE;
        }
    }
    theMat->numRows = rows; theMat->numCols = columns;
    return TRUE;
}

short allocateVector( vector local * theVec, short columns ){
    theVec->values = calloc( columns, sizeof( float ) );
    if( theVec->values == NULL ){ errorHwnd( "failed to allocate vector" ); return FALSE; }
    theVec->size = columns;
    return TRUE;
}

void freeVector( vector local * theVec ){
    free( theVec->values );
    theVec->values = NULL;
    theVec->size = 0;
}

void freeMatrix( matrix local * theMat ){
    short i;
    for( i = 0; i < theMat->numRows; i++ )
        free( theMat->values[i] );
    free( theMat->values );
    theMat->values = NULL;
    theMat->numRows = theMat->numCols = 0;
}

float vDot( float local * input1, float local * input2, short size ){
    float returnValue = 0;
    short i;
    for( i = 0; i < size; i++ )
        returnValue += input1[i] * input2[i];
    return returnValue;
}

short vectorTimesMatrix( vector local *input, vector local *output,
                         matrix local *mat ){
    short i;
    if( (input->size != mat->numCols) || (output->size < mat->numRows) ){
        errorHwnd( "illegal multiply" );
        return FALSE;
    }
    for( i = 0; i < mat->numRows; i++ )
        output->values[i] = vDot( input->values, mat->values[i], input->size );
    return TRUE;
}



Example 7
Generic Difference Screening

High density arrays comprising arbitrary (haphazard) probe oligonucleotides for generic difference screening were produced by shuffling (randomizing) the masks used in light-directed polymer synthesis. The resulting arrays contained more than 34,000 pairs of 25 mer arbitrary probe oligonucleotides. The oligonucleotides in each pair differed by a single nucleotide at position 13.

After hybridization, washing, staining, and scanning as described above, data files (containing information regarding probe identity and hybridization intensity) were created.

Differences in intensity between the two oligonucleotides comprising each probe pair K (where K ranges from 1 to 34320) were calculated. More specifically, the intensity difference between the oligonucleotides of pair K for replicate j of sample i was calculated as:

$$X_{ijK1} - X_{ijK2}$$

where X is the hybridization intensity, i indicates which sample (in this case sample 1 or 2), j indicates which replicate (in this case replicate 1 or 2 for each sample), K is the probe pair (in this case 1 . . . 34320), and 1 indicates one member of the probe pair while 2 indicates the other member of the probe pair.

Figures 16a, 16b, and 16c illustrate the differences between replicate 1 and 2 of sample 1 (Fig. 16a, the normal cell line) and between replicate 1 and replicate 2 of sample 2 (Fig. 16b, the tumor cell line) for each probe. Fig. 16a plots the value of $(X_{11K1}-X_{11K2})-(X_{12K1}-X_{12K2})$ for K = 1 to 34320 on the vertical axis and K on the horizontal axis. The two replicates were normalized based on the average ratio of $(X_{11K1}-X_{11K2})/(X_{12K1}-X_{12K2})$ for all probe pairs (i.e., after normalization, the average ratio should approximate 1). Similarly, Fig. 16b plots the value of $(X_{21K1}-X_{21K2})-(X_{22K1}-X_{22K2})$ after normalization between the two replicates based on the average ratio of $(X_{21K1}-X_{21K2})/(X_{22K1}-X_{22K2})$. Figure 16c plots the differences between sample 1 and 2 averaged over the two replicates. This value is calculated as [expression illegible in source] after normalization between the two samples based on the average ratio of [expression illegible in source].

Figures 17a, 17b, and 17c show the data filtered. Figure 17a shows the relative change in hybridization intensities of replicate 1 and 2 of sample 1 for the difference of each oligonucleotide pair. After normalization between replicates (see above), the ratio is calculated as follows: if the absolute value of $(X_{11K1}-X_{11K2})/(X_{12K1}-X_{12K2})$ is greater than 1, then the ratio = $(X_{11K1}-X_{11K2})/(X_{12K1}-X_{12K2})$, else the ratio = $(X_{12K1}-X_{12K2})/(X_{11K1}-X_{11K2})$ (the inverse). The ratio of replicate 1 and 2 of sample 2 for the difference of each oligonucleotide pair, normalized, filtered, and plotted the same way as in Figure 17a, is shown in Fig. 17b. The ratio is calculated as in Fig. 17a, but based on the absolute value of [expression illegible in source]. Fig. 17c shows the ratio of sample 1 and sample 2 averaged over two replicates for the difference of each oligonucleotide pair. The ratio is calculated as in Fig. 17a, but based on the absolute value of [expressions illegible in source] after normalization as in Fig. 16c.

The oligonucleotide pairs that show the greatest differential hybridization between the two samples can be identified by sorting the observed hybridization ratio and difference values. The oligonucleotides that show the largest change (increase or decrease) can be readily seen from the ratio plot of samples 1 and 2 (Fig. 17c). These differences do not appear to be in the background noise. Based on the identified oligonucleotide pair sequences, a gene or EST with the suspected sequence can be searched for in the sequence databases, such as GENBANK, to determine whether the gene has been cloned and characterized. If the search is negative, appropriate primers can be made to obtain the cDNA or part of the cDNA directly from mRNA, cDNA, or from a cDNA library.

From Figures 16a and 16b, it is observed that several oligonucleotide pairs show large differences between two replicates for the same sample. It is believed that this results from differential expression in a given tissue. These oligonucleotide pairs detect genes that are likely highly expressed, so the deviation of replicates for these pairs is larger than for those oligonucleotide pairs that bind to nucleotides expressed at low levels (i.e., the standard deviation of the mean is proportional to the mean). That is also why the relative change between two samples is a better measure to detect the differential expression between two samples (see Fig. 17c). In order to determine which oligonucleotide pairs are of greatest interest, the absolute and relative difference measures could be combined into a scoring function.

Increasing the number of related oligonucleotide pairs (increased redundancy) and employment of two-color hybridization/detection schemes is expected to help reduce the background variation. This allows more sensitive detection of small differences and decreases the noise and occurrence of false positives. The 25 mer array used in this example is a small subset of all possible 25 mers; thus, increasing the total number of oligonucleotide pairs will greatly increase the ability to detect changes in genes of unknown sequences by allowing more complete coverage of the available sequence.

Example 8
Nucleic Acid End Labeling

Several RNA transcripts as well as a full mRNA sample from mouse cells were fragmented by heat in the presence of Mg2+. A riboA6 (deoxyribonucleic acid 6 mer poly A) labeled with either fluorescein or biotin at the 5' end was then ligated to the fragmented RNA using RNA ligase under standard conditions.

The labeling appeared to be efficient and the hybridization pattern obtained using the labeled RNA as a probe was similar to one obtained using RNA that was labeled during an in vitro transcription step.
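The probe-pair difference, replicate normalization, and relative-change rule described in Example 7 can be sketched in C as follows. This is an illustrative reading of the text, not code from the source: the function names are hypothetical, and skipping pairs whose denominator is zero is an assumption the text does not address.

```c
#include <assert.h>
#include <stddef.h>

/* Intensity difference within one probe pair: member 1 minus member 2. */
static double pair_difference(double x1, double x2) {
    return x1 - x2;
}

/* Average ratio of replicate-1 to replicate-2 differences over all pairs.
   Pairs with a zero denominator are skipped (an assumption). */
static double normalization_factor(const double *d1, const double *d2, size_t n) {
    double sum = 0.0;
    size_t used = 0, k;
    for (k = 0; k < n; k++) {
        if (d2[k] != 0.0) {
            sum += d1[k] / d2[k];
            used++;
        }
    }
    return used ? sum / (double)used : 1.0;
}

/* Relative change: a/b when |a/b| > 1, otherwise the inverse b/a.
   Returns 0 when either difference is zero and the ratio is undefined. */
static double relative_change(double a, double b) {
    double r;
    if (a == 0.0 || b == 0.0) return 0.0;
    r = a / b;
    if (r > 1.0 || r < -1.0) return r;
    return b / a;
}
```

Dividing the replicate-1 differences by the normalization factor brings the average ratio to approximately 1, as the text requires. The relative-change rule always reports a magnitude of at least 1, so increases and decreases of the same size plot symmetrically.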



Example 9
Quantification of Labeling Efficiency

Quantification of the labeling efficiency is accomplished by spiking experiments in which specific full-length unfragmented RNA species are spiked into the total mRNA pool at different concentrations prior to the end-labeling procedure. The relative concentrations of the spiked RNA in the pool can then be measured by hybridization to a high density array of target oligonucleotides prepared as described above. This permits evaluation of the ability to detect particular RNA species at low concentration in the mRNA pool.



Example 10
PCR Labeling of Nucleic Acids

Polymerase Chain Reaction (PCR) amplification

20 µl PCR reactions substituted with biotin-dUTP were conducted and the quantity of each PCR product was estimated with gel analysis. Approximately 250 fmoles of each PCR product was pooled. A Pharmacia S300 sephacryl column (cat # 27-5130-01) was prepared with a 1 minute prespin at 3000 x g followed with a 200 µl H2O wash and spin at 3000 x g for 1 more minute. The pooled PCR product was loaded and spun for 2 minutes at 3000 x g.

The column was discarded and the eluate was speed vacuumed to dryness.

DNase Fragmentation

The dried down PCR pool was resuspended in 13 µl H2O from NEN DuPont End Labeling Kit (cat # NELS24). 2.5 µl CoCl2 and 12.5 µl TdT buffer were added. Gibco BRL DNase I was diluted to 0.25 U/µl in 10 mM Tris pH 8. 1 µl of diluted DNase was added to the PCR product pool and incubated for 6 minutes at 37°C, denatured for 10 minutes at 99°C, and cooled to 4°C. The total volume was 29 µl.

Terminal Transferase (TdT) Labeling

To the fragmented PCR pool, 2 µl of enzyme (from NEN kit, 2 U/µl stock) was added and 4 µl NEN kit biotin-ddATP was then added. The final volume was 35 µl and was incubated at 37°C for 1.5 [time unit illegible in source].

Hybridization

The 35 µl labeled target was split into 17.5 µl aliquots, one for a coding chip (GeneChip containing sense-strand sequences and permutations thereof) and one for the non-coding (antisense) chip. 182.5 µl of 2.5 M TMACl (Sigma 5 M stock diluted 1:2 using 10 mM Tris pH 8) was added, and Triton X-100 was added to a final concentration of 0.001%. In certain experiments, 4 µl of 100 nM control oligonucleotide was added to the solution rather than at the stain step.

The mixture was denatured at 95°C for [illegible in source] minutes, added directly to the chip cartridge and hybridized with mixing at 37°C for 60 minutes.

Washing

The hybridization solution was removed from the flow cell used in the GeneChip system (Affymetrix, Inc., Santa Clara, CA) and the chamber was manually rinsed 3 X with 6X SSPE/0.001% Triton X-100 to remove TMACl.

A phycoerythrin stain solution was prepared as follows: 190 µl 6X SSPE/0.001% Triton X-100 + 10 µl of 20 mg/ml acetylated BSA + 0.4 µl stock phycoerythrin (Molecular Probes Cat # S866) + 4 µl fluorescein control oligo 100 nM stock.

The staining solution was added to the flow cell with mixing at room temperature for 5 minutes. The staining solution was removed from the flow cell and the flow cell was manually rinsed 3 X with washing buffer.

The chip was washed on the hybridization station (the GeneChip system, Affymetrix, Inc.) using 6X SSPE/0.001% Triton X-100 at 35°C. 9 fill/drain changes of fresh wash solution were used and scanning took place in this buffer. Target sequences were accurately identified in this experiment.
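The volumes quoted in Example 10 can be checked with a short dilution calculation (C1 x V1 = C2 x V2). The sketch below is illustrative only; the function names are hypothetical, and treating the hybridization mix as exactly the 17.5 µl aliquot plus 182.5 µl of TMACl (ignoring the small Triton X-100 and control oligonucleotide additions) is an assumption.

```c
#include <assert.h>

/* Sum a list of component volumes (any consistent unit, e.g. microliters). */
static double total_volume(const double *volumes, int n) {
    double total = 0.0;
    int i;
    for (i = 0; i < n; i++) total += volumes[i];
    return total;
}

/* Final concentration after diluting stock_vol of stock_conc into total_vol.
   Returns 0 for a non-positive total volume. */
static double final_concentration(double stock_conc, double stock_vol,
                                  double total_vol) {
    if (total_vol <= 0.0) return 0.0;
    return stock_conc * stock_vol / total_vol;
}
```

With these helpers, the fragmentation mix (13 + 2.5 + 12.5 + 1 µl) sums to the stated 29 µl, adding 2 µl enzyme and 4 µl biotin-ddATP gives the stated 35 µl, and 182.5 µl of 2.5 M TMACl in a 200 µl hybridization volume corresponds to roughly 2.28 M.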

Example 11
End Labeling PCR Products

PCR product was fragmented and end labeled using TdT from Boehringer Mannheim. After the PCR amplification, 5 µl of a 50 µl PCR reaction was run on a 1% agarose gel to estimate total yield of the amplification reaction. To fragment the DNA, the remaining 45 µl of solution was combined with DNAse I (diluted in H2O to a final concentration of 5 U DNAse/[unit illegible in source] DNA) and reacted for 15 minutes at 37°C. The DNAse was then heat killed for 10 minutes at 95°C. The fragmented DNA solution was then held at 4°C until ready for the terminal transferase reaction.

The terminal transferase reaction mixture consisted of the fragmented PCR sample, 20 µl 5X terminal transferase reaction buffer, 6 µl 25 mM CoCl2 (final concentration 1.5 mM), 1 µl of fluorescent dideoxynucleotide triphosphates (ddNTP final concentration 10 nM), 2 µl of Boehringer Mannheim terminal transferase (TdT, final concentration 50 U/reaction), and H2O up to 100 µl volume.

The reaction was incubated for 30 minutes at 37°C. The whole reaction volume was then transferred to a 1.7 ml tube, brought up to 500 µl with 5X SSPE, 0.05% Triton, hybridized, and scanned normally.

Protocols for the 50 µl PCR reaction are found in the instructional materials accompanying the GeneChip™ HIV PRT Assay (Affymetrix, Sunnyvale, CA).

Example 12
CIAP Improves Base Calling

In certain fragment end labeling experiments, the accuracy of base calling in a GeneChip system was improved when calf intestinal alkaline phosphatase (CIAP) was used during fragmentation with DNAse. See Figure 18.

CIAP is useful in degrading any nucleotides that were not incorporated in any previous amplification, transcription, and other polymerase reactions. Such degradation prevents the incorporation of those nucleotides in subsequent reactions, such as tailing and labeling reactions, for example.

Example 13
Post-Hybridization End Labeling

Post-hybridization end labeling experiments were performed. After hybridization of a target to a probe array in the GeneChip system, the targets were labeled using terminal transferase (shown as TdTase) as shown in Figure 19.

Post-hybridization labeling was shown to yield better results when the probe array (Chip) was pre-reacted as shown in Figures 20 and 21.

Figure 21 also shows the results of a DNAse titration experiment. The various titration experiments are shown below in Table 7.

Table 7. Hybridization TdTase end labeling call accuracy. Accuracy is based on Ratio = [value illegible in source] of maximum to next highest calculated intensities. Calculated intensities = minimum of A, C, G, or T in tile set subtracted from Adjusted intensity. [Remainder of caption missing in source.]

Experiment HM207 (5 U DNAse)
  Pre-react: ddTTP = 1.8 mM; dTTP = [no value given]; TdTase = 50 U; Temp = room T; Time = 1 hr
  [Column header missing in source]: FITC-dUTP = 5 nmol; dATP = 50 nmol; TdTase = 50 U; Temp = room T; Time = 1 hr
  Accuracy: At least one strand = 100.0%; Both strands = 91.3%; GeneSeq Composite = NA

Experiment HM217 (5 U DNAse)
  Pre-react: ddTTP = 1.0 mM; dTTP = 3.0 mM; TdTase = 12.5 U; Temp = room T; Time = overnight
  [Column header missing in source]: FITC-dUTP = 0.5 nmol; dATP = 5 nmol; TdTase = 5 U; Temp = room T; Time = 15 min
  Accuracy: At least one strand = 99.8%; Both strands = 89.6%; GeneSeq Composite = 99.2%

Experiment HM220 (5 U DNAse)
  Pre-react: ddTTP = 1.8 mM; dTTP = 3.0 mM; TdTase = 12.5 U; Temp = 37°C; Time = overnight
  [Column header missing in source]: FITC-dUTP = 0.5 nmol; dATP = 5 nmol; TdTase = 5 U; Temp = 37°C; Time = 15 min
  Accuracy: At least one strand = 100.0%; Both strands = 91.1%; GeneSeq Composite = 99.1%

These results show that base calling accuracy can be impacted by the length of the target fragments. Such results further demonstrate the utility of the methods disclosed herein.

Other experiments have shown that 1 U of DNAse is particularly useful in obtaining ideal fragment lengths.


Example 14
End-Labeling (Tailing) with PolyT

The nucleic acids tailed with poly-A or poly-A analogs (labeled or unlabeled) using methods similar to those set forth in Example 13 can be labeled using labeled poly-T, as shown in Figure 22.

Example 15
Synthesis of Fluorescent Triphosphate Labels

To 0.5 µmoles (50 µL of a 10 mM solution) of the amino-derivatized nucleotide triphosphate, 3'-amino-3'-deoxythymidine triphosphate (1) or 2'-amino-2'-deoxyuridine triphosphate (2), in a 0.5 ml eppendorf tube was added 25 µL of a [concentration illegible in source] M aqueous solution of sodium borate, pH 7, 87 µL of H2O, and 88 µL (10 µmol, 20 equiv) of a 100 mM solution of 5-carboxyfluorescein-X-NHS ester in methanol. The mixture was vortexed briefly and allowed to stand at room temperature in the dark for 15 hours. The sample was then purified by ion-exchange HPLC to afford the fluoresceinated derivatives Formula 3 or Formula 4, below, in about 78-84% yield.

[Chemical structures: Formula 3 and Formula 4, the fluorescein-labeled nucleoside triphosphates.]

These molecules are not substrates for terminal transferase (TdT). It is believed, however, that these molecules would be substrates for a polymerase, such as Klenow fragment.

Example 16
Synthesis of as-Triazine-3,5[2H,4H]-diones

The analogs as-triazine-3,5[2H,4H]-dione ("6-aza-pyrimidine") nucleotides (see Fig. 23a) are synthesized by methods similar to those used by Petrie, et al., Bioconj. Chem. 2: 441 (1991).

Other useful labeling reagents are synthesized, including 5-bromo-U/dUTP or ddUTP. See, for example, Lopez-Canovas, L. et al., Arch. Med. Res. 25: 189-192 (1994); Li, X., et al., Cytometry 20: 172-180 (1995); Boultwood, J. et al., J. Pathol. 148: 61 ff. (1986); Traincard, et al., Ann. Immunol. 134D: 399-405 (1983); and Figures 23a and 23b set forth herein.

Details of the synthesis of nucleoside analogs corresponding to all of the above structures (in particular those of Fig. 23b) have been described in the literature. Known procedures can be applied in order to attach a linker to the base. The linker modified nucleosides can then be converted to a triphosphate amine for final attachment of the dye or hapten, which can be carried out using commercially available activated derivatives.

Other suitable labels include non-ribose or non-2'-deoxyribose-containing structures, some of which are illustrated in Fig. 23c, and sugar-modified nucleotide analogues such as are illustrated in Fig. 23d.

Using the guidance provided herein, the methods for the synthesis of reagents and methods (enzymatic or otherwise) of label incorporation useful in practicing the invention will be apparent to those skilled in the art. See, for example, Chemistry of Nucleosides and Nucleotides 3, Townsend, L.B. ed., Plenum Press, New York, at chpt. 4, Gordon, S., "The Synthesis and Chemistry of Imidazole and Benzimidazole Nucleosides and Nucleotides" (1994); [citation partly illegible in source], Chemistry of Nucleosides and Nucleotides 3, Townsend, L.B. ed., Plenum Press, New York (1994). [Subject illegible in source] can be made by methods similar to those set forth in Chemistry of Nucleosides and Nucleotides 3, Townsend, L.B. ed., Plenum Press, New York, at chpt. 4, Gordon, S., "The Synthesis and Chemistry of Imidazole and Benzimidazole Nucleosides and Nucleotides" (1994); Lopez-Canovas, L. et al., Arch. Med. Res. 25: 189-192 (1994); Li, X., et al., Cytometry 20: 172-180 (1995); Boultwood, J. et al., J. Pathol. 148: 61 ff. (1986); Traincard, et al., Ann. Immunol. 134D: 399-405 (1983).

Example 17
Biotin-Chem-Link (Boehringer Mannheim)

The labeling density is supposed to be 1 biotin per 10 bases. Coordinative, non-covalent binding of Biotin-Chem-Link (BCL) to N7 of adenosine and guanosine involves 1 ug RNA or DNA + 1 ul BCL in 20 ul vol. at 85°C for 30 minutes.

RNA labeling experiment (4 sets of 4 pooled RNA transcripts)

Very poor labeling and/or hybridization (can't see 5 pM at all, 20 pM is very weak). Samples may have been lost after labeling when microcon-100s were used to remove unincorporated label. RNA was fragmented after labeling. It is believed that this should not be a problem (BM tech help).

BCL labeling of dsDNA

Low signal, background across the entire chip. No discrimination.

Fast-Tag (Vector Labs) (RNA)

Should get 1 biotin per 10-20 bases. Five reactions were run:

a) RNA1+RNA2+RNA3 (5 pmoles each, total of 5.2 ug) + 25 ul Fast Tag reagent
b) RNA1+RNA2+RNA3 (9 pmoles each, total of 9.4 ug) + 25 ul Fast Tag reagent
c) RNA1+RNA2+RNA3 (18 pmoles each, total of 19 ug) + 40 ul Fast Tag reagent
d) RNA4+RNA5+RNA6 (10 pmoles each, total of 8.7 ug) + 25 ul Fast Tag reagent
e) RNA7+RNA8+RNA9 (10 pmoles each, total of 11.4 ug) + 25 ul Fast Tag reagent

The heat method was used to link S-S to RNA. The result: 20 X lower hybridization signal than the same targets labeled by the IVT method.

Example 18
RNA Ligase/bioA6 end labeling

This experiment generally involved the following steps: a) RNA was fragmented; b) RNA fragments were 5' phosphorylated with polynucleotide kinase/ATP; and c) the 5' end of the RNA was ligated to the 3' end of BioA6 using RNA ligase. This is illustrated by the following formula:

5' biotin-AAAAAA-OH 3' + 5' P-RNA-OH 3' = 5' bioAAAAAA-RNA 3'

Previously this technique was used to label total cellular mRNA which was hybridized to unpackaged chips (high density oligonucleotide arrays) (on 2x3 slides) in a 10 ul volume. Lack of mixing was a significant problem and resulted in low hybridization intensities. In vitro transcription labeled RNA under these conditions gave 10 X higher signal than bio-A6/RNA Ligase labeled target.

In other experiments, 3 different ratios of bio-A6:RNA were used:

1) 1x bioA6 = 0.5 nmoles biotin-A6 per 1 ug RNA;
2) 2x bioA6; and
3) 4x BioA6.

After labeling, the sample was spun through a microcon-30 and microcon-3 to remove enzymes and dilute out buffer components.

Bio-A6 labeled target hybridized to chips (high density oligonucleotide arrays) gave approximately the same hyb. intensity as in vitro transcription (IVT) labeled target.

Staining was for 15 minutes with PE at normal conc. No significantly higher signal or background was seen with 4x as much bioA6 per ug RNA.

For these experiments, BioA6 (5' biotin-AAAAAA RNA) was ordered from Genset.

Example 19
Preparation of Gene-Specific [title truncated in source]

Template DNA preparation

Linearization of vector:

If the gene is not already cloned in a vector with T3 and T7 RNA polymerase promoter sites flanking the insert, see PCR amplification below. The vector is linearized with an enzyme that cuts at the 3' end of the insert for sense transcripts, or at the 5' end for antisense transcripts. The insert sequence was checked to verify that the RE does not cut internally. In a preferred embodiment, a restriction enzyme was chosen that does not produce 3' protruding ends.

Following linearization, an aliquot of the sample is run on a gel (next to uncut vector) to verify complete digestion.

The sample is optionally treated with Proteinase K (100-200 ug/ml) at 50°C for 20 min - 1 hour to remove enzyme or residual RNases (used in plasmid miniprep protocols).

The linearized DNA is purified by phenol/chloroform extraction and ethanol precipitation or 3-4 rounds of microcon-100 concentration/redilution (see below).



jwtK loters. 



Amilification is only preferred if the dekired region of the gene is not 
already in a cloning vector with RNA polymerase 

Starting with genomic DNA (or cDNA] , 
(or region of the gene represented on 
polymerase promoter sequences and 3' gene-specific 

The following 5' sequence has worked 
bases added to the 3* end). 



1 the chip) using P ::R primers 



amplify the ORF of interest 
with5'T3/nRNA 

sequences. 

1 veil (with 19-21 gene-specific 



5*-GAATTOTAATACGACTCACTATAG0GAGG-[H 19-21 gene-spedfic basest' 



not part of the promoter 
FVT efficiency. 



The S' end consists of: 

a) six 5* flanking bases of your cho(ce 
sequence, but necessary for 

b) 17 bases of the core T7 

c) 1st 6 bases transcribed (sequenc^ of +1 to -^6 can affect 
efficiency) 

The other PGR primer would then contain the T3 
sequence at the 5* end. ThefoUowingsequencehas wijrked well: 



25 5'-AGATGCAATTAACCCTCACTAAA0GGA0A-< 



The 5' end consists of: 

a) six 5* flanking bases (sequence 

b) 17 bases of core T3 RNA PolyrajeFBse 

c) +1 to '^6 transcribed bases 



RNA po ymeiase promoter sequence 



RNA polymerase promoter 



19-21 gene^specific bases)-3' 



t vary fiom this exanqtle) 
» promoter sequence 






Amplify the desired sequence usmg 



5 cycles at the annealing temp, best suited for the gene 
alone (typically followed by 25 cycles with 

products on an agarose gel (3-5 ul of a too Mlixn). li 'is 

this stage. 



Stan lard PCR conditions with 1st 
! pecific part of tiie primers 
annealing at 70*'C Check PCR 
not neccssaiy to quantity at 



Ihisi 



OgOumal PraUinase K treatment: 

Add 1 ul of Proteinase K (20 mgM) (A^bi 
PCR reaction and incubate 20 min to 1 h at 50-60°C, 
but if the ih vitro transcription (IVT) products appear 
I»oduct included in the kit (described later) is fiill 
prior to the microcon-lOO and IVT. 



\ion) to the remainder of the 
is usually not necessary, 
d|gnided v/tdlt the control IVT 
then tMs step may be added 



lleng^^ 



Microcon 50/100 purification

Other purification methods are being tested. Ethanol precipitation can be substituted for microcon-50 purification. CAUTION: Microcons may leak. Save all flow-through portions.

Add 380 µl RNase-free water to the PCR product and concentrate using a microcon-100 or microcon-50 as suggested in the instructions (Amicon). Repeat the dilution and concentration 2-3 times. The final concentrated sample should be 5-100 µl.

In vitro transcription labeling with biotin

For maximum yield use Ambion's T3 (#1338) or T7 (#1334) Megascript system (their proprietary buffer allows higher nucleotide concentrations without inhibiting the polymerase). (Read Ambion instructions and suggestions in kit book!)

Perform IVT as suggested, but with (1:3) biotinylated:unlabeled CTP and UTP. Do not interchange T3 and T7 10x nucleotides that come with the Megascript kits.

For example, make a NTP mix for 4 IVT-labeling reactions as follows:
8 µl Ambion's T7 10x ATP [75 mM]
8 µl Ambion's T7 10x GTP [75 mM]
6 µl Ambion's T7 10x CTP [75 mM]
6 µl Ambion's T7 10x UTP [75 mM]
15 µl Bio-11-CTP [10 mM] (ENZO #42818)
15 µl Bio-16-UTP [10 mM] (ENZO #42814)

For each IVT-labeling reaction, add (at room temp. - not on ice):
14.5 µl NTP mix
2.0 µl 10x T7 transcription buffer (Ambion)
*1.5 µl purified PCR product (not more than 1 µg)
2.0 µl 10x T7 enzyme mix (Ambion)

*Do NOT add more than 1 µg of DNA to the IVT reaction. Higher concentrations of DNA actually inhibit the reaction and result in LOWER yields.

Final rNTP composition:
7.5 mM ATP
7.5 mM GTP
5.625 mM cold UTP/1.875 mM bio-UTP
5.625 mM cold CTP/1.875 mM bio-CTP

Incubate 4-6 hours at 37°C. Shorter incubation times may be sufficient for some transcripts or when maximum yield is not important.
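The stated final rNTP composition can be checked against the NTP mix recipe. The following is a minimal sketch, assuming a 20 µl reaction (14.5 µl NTP mix, 2.0 µl buffer, 1.5 µl template, 2.0 µl enzyme mix); the function and variable names are illustrative only and are not part of the protocol.

```python
# Stock volumes (µl) and concentrations (mM) in the NTP mix for 4 reactions,
# as listed in the recipe. Names here are illustrative, not from the protocol.
NTP_MIX = {
    "ATP": (8.0, 75.0),
    "GTP": (8.0, 75.0),
    "CTP": (6.0, 75.0),
    "UTP": (6.0, 75.0),
    "bio-CTP": (15.0, 10.0),
    "bio-UTP": (15.0, 10.0),
}
N_REACTIONS = 4
# Assumed reaction volume: NTP mix + buffer + template + enzyme mix (µl).
REACTION_VOLUME_UL = 14.5 + 2.0 + 1.5 + 2.0


def mix_volume_per_reaction(mix=NTP_MIX, n=N_REACTIONS):
    """Volume of NTP mix (µl) available to each reaction."""
    return sum(vol for vol, _ in mix.values()) / n


def final_concentrations(mix=NTP_MIX, n=N_REACTIONS, reaction_ul=REACTION_VOLUME_UL):
    """Final concentration (mM) of each nucleotide in one reaction.

    µl x mM gives nmol; dividing by the reaction volume in µl gives mM again.
    """
    return {name: vol * conc / n / reaction_ul for name, (vol, conc) in mix.items()}


if __name__ == "__main__":
    print(mix_volume_per_reaction())
    for name, mm in final_concentrations().items():
        print(f"{name}: {mm} mM")
```

Under these assumptions the cold:biotinylated ratio for CTP and UTP comes out at 3:1, which is the (1:3) biotinylated:unlabeled ratio called for above.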

Optional DNase I treatment

Add 1 µl RNase-free DNase I (provided with Ambion kit) to each reaction and mix well. Incubate 15-20 min. at 37°C.

Optional Proteinase K treatment

This step may help reduce background caused by nonspecific protein binding to chip and to streptavidin-phycoerythrin:

Add RNase-free water to IVT reactions to a final volume of 99 µl.
Add 1 µl of Ambion's 20 mg/ml proteinase K.
Incubate at 50°C 20-30 min.

Microcon purification

Several other puzification methods have lleen 
sufficiently remove rNTPs or had low yields. A|»otoo(l 
purification (Arcbana Nair) kwks very piomisiiig and 
microooD purification. 

Note: Set aside an aliquot of the IVT 
purification. Setting aside 1 % will enable trouble shootifig of this step 



wU 



real txon 



I. 



: water to sampi B and 



Add 400 ul DEPC water to 
microconSOor 100(as 
FLOW-THROUGH FRACTIONB. 
Repeat dilution/concentration 3-^ 
lO-IOOfU. 



suggested by Amicon). 



See comments below. 

Ckeek IVTproduct(s) ona^ 

UsuaUy it is sufficient to check -O.Ol-l"/ 
nondenaturing agarose/TBE gel. Samples are heated to 
electiophoiesis. A single band dose to the expected si 
If there is enough space on the gel, run 2 or 3 different 
and purified IVT products on a gel (~ 0.01%. 0.1% and 
stained with Sybr Green U (FMC) at a 1 : 1 0,000 dUutioji 
sensitive than ethidium bromide). 

If precise detBomination of transcript 
can be run with biotiiiylated RNA standards (available 



Quanta transcript yidd by A^f 

Ejqwct 75-150 |ig RNA per 1 ng startin|s 

quantitation of purified transcript, about 1% of the 
water (or TE) into a M volume of 60-70 ul (for a mibocuvettc) 
absorbance readings within the accurate range (0. 1-1 (pD). 



tested - many did not 
for Carboxy bead-based 
soon be used in place of 



before further 

if necessary, 
concentrate sample with 
I. SAVE ALL 



times. Final volume can be 



& of the reaction on a 
65°C for 1 5 minutes prior to 

is usually observed, 
liiutions of both the unpurified 
1% of each). Gels can be 
in Ix TBE buffer (more 

it desired, a denaturing gel 
fiom Anbion). 



concentrated 



DNA tenq)late. For 

sample diluted with 
should give 
For accurate pipetting 
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H3 



I sera] 



I measure 



: been sufficien ly removed 



:RNA 



volumes (> 1 ^]), it is usually necessary to make a 
make a I /1 0 dilution of your RNA sample, then 
ul final voL)' Always be sure to take a blank reading 
the same bufTerAvater that the RNA sample is diluted 

Since accurate quantitation of pure traifscript 
meaningful spiking experiments, extra care should be 
nucleotides from the TVT reaction have 
contiibutiog to the A^ep* 

The microcon flow thnnigh should be saved and ched^ed 
absoibancc is present in the last flow through, the 
additional rounds of dilution and concentration until 
detected at 260 nm. 

Since microcon iiltraticm 
save all flow- through fractions. Ifthe transcript RNA 
retained/collected sample is mudi lower than predicte \ 
can be rr-concentiated using a fiesh cartridge (then diluted 
4 times). 

Example 14 
LabeUng Toiai mRNA from 

Staitittg material: Good quality poly A 
1 0' cells *(0. 1 ug-5ug poly A+). It is more economica 
RNA (up to 5 ^g), but if material is limited, as little a2 
sufficient quantity of labeled RNA target (10 ^ug afte 



dilution first (for example, 
10% of the dilution in 60-70 
n the same cuvette and using 
nto. 

is essential for 
taken to verify that excess 
and are not 



I devices occas sionaly 



Double Stranded cDNA Synthesis:

This protocol is a supplement to instructions 
Superscript CHioice System. Before proceeding read tt b 
BRL's Superscript Choice System for cDN A Synthea s 
seq n ence (below) for prisuQg the revase transcriptioi -first 
instead of the oligo(dT) or random primers provided «riih 



forA26o. Ifsignificam 
should be subjected to 
significant absorbance is 



leak, it is advisable to 
concentration in the 
the flow^through fractions 
and reconccntrated at least 



Cells/Tillies 



RNA from at least 5 x lO'-l x 
to start with more poly A+ 
0. 1 ng of polyC A)+ can yield a 
IVT labeling/amplification). 



provided in Gibco BRLOs 
Gibco iKotocol. Follow Gibco 
except use the T7<r)24 
strand cDNA synthesis 
the kit 
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5'<KKX:AGTGAATrGTAATAqGACTCACTATAGGGAG 



First Strand Synthesis

1. Use 0.1 µg-5 µg poly(A)+ RNA and adjust amount of H2O and enzyme as indicated in the BRL instructions. For example:
3 µl DEPC-water
4.5 µl (1 µg/µl) mRNA
1 µl (100 pmol/µl) T7-(T)24 primer

2. Mix/spin/incubate at 70°C for 10 minutes. Chill on ice.

3. Add the following components (on ice) to the RNA/primer mix:
4 µl of 5X 1st strand cDNA buffer
2 µl 0.1 M DTT
1 µl [10 mM] dNTP mix

4. Incubate at 37°C for 2 minutes.

5. Add 4.5 µl Superscript II reverse transcriptase/mix well. Use (1 µl SSII RT per µg RNA). For <1 µg RNA use 1 µl RT.

6. Incubate for 1 hour at 37°C.

Final Reaction Composition (20 µl vol.):
50 mM Tris-HCl, pH 8.3
75 mM KCl
3 mM MgCl2
10 mM DTT
500 µM each: dATP, dCTP, dGTP, dTTP
100 pmol T7-(T)24 primer
4.5 µg mRNA
900 U RT (200 U per µg mRNA)
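The instruction to adjust the amount of water and enzyme to the amount of RNA can be sketched as a small calculation. This is only an illustration, assuming the mRNA stock is at 1 µg/µl (as in the worked example) and that the reaction is always made up to 20 µl; the function name and dictionary keys are hypothetical.

```python
def first_strand_volumes(mrna_ug, mrna_conc_ug_per_ul=1.0, total_ul=20.0):
    """Return component volumes (µl) for one first-strand reaction.

    Assumes a 20 µl reaction and an mRNA stock of 1 µg/µl unless told otherwise.
    """
    if not 0.1 <= mrna_ug <= 5.0:
        raise ValueError("the protocol covers 0.1-5 µg poly(A)+ RNA")
    rna_ul = mrna_ug / mrna_conc_ug_per_ul
    # 1 µl reverse transcriptase per µg RNA; for less than 1 µg RNA use 1 µl.
    rt_ul = max(1.0, mrna_ug)
    fixed = {
        "T7-(T)24 primer (100 pmol/µl)": 1.0,
        "5X first strand buffer": 4.0,
        "0.1 M DTT": 2.0,
        "10 mM dNTP mix": 1.0,
    }
    water_ul = total_ul - rna_ul - rt_ul - sum(fixed.values())
    if water_ul < 0:
        raise ValueError("RNA stock is too dilute to fit in the reaction volume")
    return {"DEPC-water": water_ul, "mRNA": rna_ul, **fixed, "reverse transcriptase": rt_ul}


if __name__ == "__main__":
    for name, vol in first_strand_volumes(4.5).items():
        print(f"{vol:5.1f} µl  {name}")
```

With 4.5 µg of mRNA the sketch reproduces the worked example (3 µl water, 4.5 µl mRNA, 4.5 µl enzyme).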



Second Strand Synthesis

1. Place first strand reactions on ice (quickly spin down).

2. Add:
91 µl DEPC-H2O
30 µl 5x Second Strand Buffer
3 µl [10 mM] dNTP mix
1 µl [10 U/µl] E. coli DNA Ligase
4 µl [10 U/µl] E. coli DNA Polymerase I
1 µl [2 U/µl] RNase H

Final Composition (150 µl):
25 mM Tris-HCl, pH 7.5
100 mM KCl
5 mM MgCl2
10 mM (NH4)2SO4
0.15 mM b-NAD+
250 µM each: dATP, dCTP, dGTP, dTTP
1.2 mM DTT
65 U/ml DNA Ligase
250 U/ml DNA Polymerase I
13 U/ml RNase H

3. Mix/spin down/incubate at 16°C for 2 hours.

4. Add 2 µl [10 U] T4 DNA Polymerase.

5. Incubate 5 min. at 16°C.

6. Add 10 µl 0.5 M EDTA/store at -20°C.
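As a cross-check on the second strand recipe, the volumes and enzyme concentrations can be tallied. The sketch below assumes 91 µl of water (the volume that brings the reaction to the stated 150 µl) and treats the listed U/ml values as rounded nominal figures; all names are illustrative.

```python
# Volumes (µl) added to the 20 µl first-strand reaction. The 91 µl water volume
# is an assumption chosen so that the total matches the stated 150 µl.
FIRST_STRAND_UL = 20.0
ADDITIONS_UL = {
    "DEPC-H2O": 91.0,
    "5x second strand buffer": 30.0,
    "10 mM dNTP mix": 3.0,
    "E. coli DNA ligase": 1.0,
    "E. coli DNA polymerase I": 4.0,
    "RNase H": 1.0,
}
# name: (volume in µl, stock activity in U/µl)
ENZYMES = {
    "E. coli DNA ligase": (1.0, 10.0),
    "E. coli DNA polymerase I": (4.0, 10.0),
    "RNase H": (1.0, 2.0),
}


def second_strand_volume_ul():
    """Total second-strand reaction volume (µl)."""
    return FIRST_STRAND_UL + sum(ADDITIONS_UL.values())


def enzyme_units_per_ml():
    """Enzyme activity (U/ml) in the assembled second-strand reaction."""
    total_ml = second_strand_volume_ul() / 1000.0
    return {name: vol * stock / total_ml for name, (vol, stock) in ENZYMES.items()}


def volume_before_extraction_ul(t4_polymerase_ul=2.0, edta_ul=10.0):
    """Sample volume (µl) after adding T4 DNA polymerase and EDTA."""
    return second_strand_volume_ul() + t4_polymerase_ul + edta_ul


if __name__ == "__main__":
    print(second_strand_volume_ul(), volume_before_extraction_ul())
    print(enzyme_units_per_ml())
```

The 162 µl returned by the last function is the volume to which an equal volume of phenol:chloroform:isoamyl alcohol is added in the cleanup step.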

CLEANUP 

Phemol/ckhrofcnn extraction 

Optional:To reduce sample loss during e^draction, see die PLC 

protocol bdow 

1. Add an equal volume (162 ul) of (25:24:1) 

Phenokchloioformnsoamyl alcoh >i (saturated with 10 mM 
Tris-HCl pH 8.0/lxnM EDTA - S gma). 
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Vortex/^ S mimnes @ 1 4000 x ^ Tiansfer aqueous phase to 
afresh 1.5 ml tube. 



FLG-^MoUChloToform Extraelion 

Phase Lock Qels (PLG)* fonn an inert 
aqueous and ofganic phases of phenotcfalMofonn 
allows more complete recovery of the sample (aqueous 
inter&ce contaminatioii of the sample. PLGds are sold 
ml tubes, to which the user directly adds sample and 



sealed barrier between the 
exUBcfions. The solid barrier 

p lase) and minimizes 
premeasured aliquots in 1.5 
pheJiol-chlorofonxL 



1. 



2. 
3. 



of (25:24; 



(sa ttuated ^ 



Pellet the Phase Lock Gd (1 .5 ml 
microcentrifuge for 20-30 seconds 
worfc, but we haven't specifically 
Transfer the entire (1 62 |il) cDNA 
Add an equal volume (162 ^1) 
cUofiofbrm: isoamyl aldiohol 
ph 8.0/1 mMEDTA-Sigma). 
Mix by inverting (DO NOT VORfnEX). 
part of the suspension, 
xg or greater) for 2 min. 
Transfisr the aqueous upper phase] to 
PLG lis available fiom 5 Prinie-3 Prime, Inc.. cat 
188233 for 200 



ube with PLG I -light.) in a 
[PLG I-heavy should also 
ested it for this application], 
sample to the PLG tube. 
1) Phenol: 

widilOmMTris-HCL 



Microcenirifuge 



5. 



tesed 



Mlcroam-SOPyr^ealhn 

Other purification methods are being 
be subsituted for micron-50 purification. CAUTION: 

flow-through portions. 

1. Add300ulofSmMTrispH 

2. Concentrate by spinning tfarougl 
(Microcon-50 columns, Amicon 
directions supplied by Amicon. 



7.5 to 



PLG will not become 
at full ^)eed (12,000 



a fresh 1.5 ml tube. 
175850 for 50 or#pl- 



i. Ethanol precipitation can 
^icrocons may leak. Save aU 



sample, 
a MicrocoD-50 column 
part #42416) following 
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3. 



vjapeax diludon/coticentration 



3-4 



flow through in case of column 
Concentrate to a final volume of S-10 ul if possible, 
cartridge to spin to diyness. Collect upper volume. 



times, collect and set aside 
failure, 
taking care not to allow the 



In Vitro Transcription Labeling with Biotin

For maximum yield use Ambion's T3 (#1338) or T7 (#1334) Megascript System (their proprietary buffer allows higher nucleotide concentrations without inhibiting the polymerase).

Perform IVT as suggested, but with (1:3) biotinylated:unlabeled CTP and UTP. Do not interchange T3 and T7 10X nucleotides that come with the Megascript System. Read the Ambion detailed instructions and suggestions before proceeding.

NTP Labeling Mix

To make NTP labeling mix for 4 IVT-labeling reactions combine:
8 µl Ambion's T7 10x ATP [75 mM]
8 µl Ambion's T7 10x GTP [75 mM]
6 µl Ambion's T7 10x CTP [75 mM]
6 µl Ambion's T7 10x UTP [75 mM]
15 µl Bio-11-CTP [10 mM] (ENZO #42818)
15 µl Bio-16-UTP [10 mM] (ENZO #42814)



IVTReaetioH 



not on ice 



For each reaction, combine the f allowing at room temperature. 



14.5 (il NTP labeling mix 

2.0 ^1 lOx T7 transcription bufKlr 
*1.5^dscDNA (O.M ugiso|timal 
2.0 |il lOx T7 enzyme mix (Ami ion) 



(Ambion) 

see note below!) 
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*Do NOT add more than 1 |ig of ds cDNA to the IVT re^on. 
of DN A actually inhibit the reaction and result in LOWE|l 
Final rNTP Composition: 

7.SmMATP 

7.5 mM OTP 

5.625 mM cold UTP/1 .875 mM 
5.625 mM cold CTP/1.875 mM 

2. Incubate 4-6 hours 313700. (Shor 
sufBcient for some transcripts or 
important). 

3. Store unused NTP labeling mix at 



bi>-UTP 
bil>-CTP 

[er incubation times may be 
vfhen maximum yield is not 



20'C. 



CLEANUP

Optional DNase I Treatment

1. Add 1 µl RNase-free DNase I (provided with Ambion kit) to each reaction and mix well.

2. Incubate 15-20 min. at 37°C.

Optional Proteinase K Treatment

This treatment may help reduce background caused by nonspecific protein binding to chip and to streptavidin-phycoerythrin.

1. Add RNase-free water to IVT reactions to a final volume of 99 µl.

2. Add 1 µl of Ambion's 20 mg/ml Proteinase K.

3. Incubate at 50°C 20-30 minutes.



Microcan Fur^icaiion 

Several other purification methods have 
sufficiently remove rNTPs or had low yields, A protocj)! 
purification (Archana Nair) looks very promising and 
miciocon puiificatioiL Set aside an aliquot of the IVT 
purification. Setting a»de 1% will enable trouble 



Higher concentrations 
yields. 



jecn tested - many did not 
for Carboxy bead-based 
soon be used in place of 
1 eaction before further 
shoot ng of this step if necessary. 



viU 
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Add 400 ul DEPC water to 
microcon 50 or 1 00 (as suggest^ 
FLOW-THROUGH FRACnOJ^S, 
Repeat dilution/coiicentration 
10-100 ul. 

Since microcon filtration device^ 
advisable to save all flow-throu^ 
concentration in the retai 
than predicted, the flow-through 
concentrated using a fresh colunln 
recoQcentiatBd at least 4 times, 



3 4 times. Final volurac can be 



occasionally leak, it is 
fractions. If transcript RN A 
lected sample is much lower 
fracdons can be re- 
tben diluted and 



Noieson YieU 






1. Starting with 4-5 ug poly (A)*^ 

using 20% of the purified ds cD|IA 
-75 - 125 ug labeled RNA per 

Z Reading-* 1% of the concentratcjd 
TE) into a final volume of 60-7o|ul 
give absnbance data within the 
accurate pipetting volumes (> 1 
make a serial dilution first 
of your RNA sample, then 
70 ul final vohime. Be sure to 
cuvette and use the same buffer/fatcr 
the RNA san^le. 

3. For accurate quantitation of 
taken to verify that excess 
have been sufiSdently removed 

The microcon flow-through should be saved and check^ 
absoibance is present in the last flow through, the RNA 



I miclec tides 



sndi 



e and concentrate sample with 
byAmicon). SAVE ALL 



q>r the ds cDNA synthesis and 
sample for the IVT. expect 

reaction. 

sample diluted with water (or 
(for a niicnx:uvette) should 
I iccurate range (0.1-1 OD). For 
il), it is usually necessary to 
make a I/IO dilution 
10% of die dilution in 60- 
blank readings in the same 
that was used for diluting 



For e icamplc, i 



)ta]» 



labeled RNA, extra care should be 
fiom the rVT reaction 
are not contributing to the 



forA,^ If significant 
should be subjected to 
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additional rounds of dilution and concodration until no significant absoibance is 
detected at 260 nm. 
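The quantitation described in these notes can be sketched as follows. The conversion factor of 40 µg/ml per A260 unit is the standard value for single-stranded RNA and is an assumption here, since the text does not state it; the function name and parameters are likewise illustrative.

```python
# Standard conversion for single-stranded RNA (assumed; not given in the text).
RNA_UG_PER_ML_PER_A260 = 40.0


def rna_yield_ug(a260, cuvette_ul, fraction_of_sample_measured, blank_a260=0.0):
    """Estimate total RNA (µg) in the concentrated sample from one A260 reading.

    fraction_of_sample_measured is the share of the whole sample placed in the
    cuvette, e.g. 0.01 when 10% of a 1/10 dilution is measured.
    """
    net = a260 - blank_a260
    if not 0.1 <= net <= 1.0:
        raise ValueError("reading is outside the accurate range (0.1-1 OD)")
    ug_in_cuvette = net * RNA_UG_PER_ML_PER_A260 * cuvette_ul / 1000.0
    return ug_in_cuvette / fraction_of_sample_measured


if __name__ == "__main__":
    # 1% of the sample diluted into a 70 µl microcuvette, reading 0.5 OD.
    print(rna_yield_ug(0.5, 70.0, 0.01))
```

Under these assumptions a reading of 0.5 OD from 1% of the sample in 70 µl corresponds to roughly 140 µg of labeled RNA, which falls within the expected yield range, provided that unincorporated nucleotides are no longer contributing to the absorbance.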



Check unfragmented samples on gel 

Electrophoiese the labeled RNA before 
size distribution of labeled transcripts. Samplescan 
and electiophorcscd on agaiose/TBE gels to get an 
size range. If there is enough ^»ce on the gel, nm 2 or 3 
unpurified and purified IVT pioducts on a gel (*- 0.01%, 
can be stained with Sybr Gffcen n (FMC) at a 1 : 10,000 dilution 
(more sensitive than ethidium bromide). 

Alternatively, for more accurate estimatio|is 
the RNA population pre and post fingmrntation, 
^iiMrtiirifig gel using biotinylated RNA molecular weight maricers (Ambion). 



ibehiated 



lapproidmate 



ion to observe the 
to65''Cfor 15 minutes 
idea of the transcript 
UfTerent dilutions of both the 
1% and 1% of each). Gels 
hi IxTBE buffer 



Psottden 



lyophi ized 
Labe ing 

I actually cheaper (per nmok) when bought 



Example 15 
Direct labeiing ofDNA with 
The psoralen-biotin reagent comes 
separately or as part of ^'RadrFree Univeisal Oligo 
(Schleicher ASchuell). It is 
you might as well get the extra kit components and save 
Universal Oligo Labeling and Hybridization kit catalog 
nmoles of Psoralen-biotin). The same kit with UV Long 
#4g3124.. 

1. Spin down then resuspend the 
reagent in either 

a) HulofDMFifyoumay 
or oligonucleotides with 
be more concentrated) OF 

b) 56ulofDMFij 
iiragmentatioiL 



of the size distribution of 
fcse samples through a 



and can be bought 
and Hybridization Kit** 
with the kit so 
money. TheRad-Free 
1^483122 (contains 20 
wave 365 nm lamp: 



lyo ^lized psoraien-biotin 



abel fragmented DNA/RN A 
of the reagent (it needs to 



if you will I lefinitely be labeling before 
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Labeling has been performed 

vnUi similar results, but it is 
because it can't be labeled in 
Adjust the RNA/DNA 



b(thl 



1 becon le 



5 for 10 tig of DNAX less than 

(pH 2.5-1 0) so you can just use 
lesuspend or dilute the 
to be linearized. 
If RN A/DNA is in high salt, it 

10 using the i^jpropriate size 

for fragmented material but 
3. Boil sample 1 0 minyquick chill 
[important - ds DNA wiU 
strands are not separated 

15 4. In dim light add 1 ul oj 

DNA/RN A solution (lul 
in 56 ul DMF per ug DN A/RN/f). 
resu^pended in 14 uU dihite the 
labeling 1:3 in DMF (1 ul cone 

20 5. Transfer solution to into a well 

(uptol50ulAweU). 
6. Place 365 nmUV 

source is about 2 cm from the 
one hour. 

25 7. Transfer samples to 

of HjO-sanuated n-butanol to 
biotin. vortex/centrifuge 1 mirt 
Discard butanol (top layer). 
Fragment as you would luxmaU y. 
30 hybridization (10 min 99-1 OO''^). 

* longer UV irradiation does not inqnove results. 
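The reagent amounts in this protocol scale with the mass of nucleic acid. A minimal sketch of that bookkeeping, assuming 0.5 µg per 10 µl of sample and 1 µl of reagent (from a 56 µl DMF resuspension) per µg, with a 14 µl resuspension diluted 1 part to 3 parts DMF before use; the function name and return keys are hypothetical.

```python
def psoralen_labeling_plan(nucleic_acid_ug, stock_resuspension_ul=56):
    """Return sample volume and reagent volumes (µl) for one labeling reaction.

    stock_resuspension_ul is the DMF volume the lyophilized reagent was
    resuspended in: 56 (ready to use) or 14 (four times more concentrated).
    """
    if stock_resuspension_ul not in (14, 56):
        raise ValueError("only the 14 µl and 56 µl resuspensions are described")
    sample_ul = nucleic_acid_ug * 20.0       # 0.5 µg per 10 µl
    working_ul = nucleic_acid_ug * 1.0       # 1 µl working reagent per µg
    if stock_resuspension_ul == 56:
        return {"sample_ul": sample_ul, "stock_ul": working_ul, "dmf_ul": 0.0}
    # 14 µl stock: dilute 1 part stock with 3 parts DMF to reach working strength.
    return {"sample_ul": sample_ul, "stock_ul": working_ul / 4.0, "dmf_ul": working_ul * 3.0 / 4.0}


if __name__ == "__main__":
    print(psoralen_labeling_plan(10))
    print(psoralen_labeling_plan(10, stock_resuspension_ul=14))
```

For 10 µg of DNA this gives the 200 µl sample volume quoted in the protocol.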



before and after fragmenting 
to do before fragmentation 
salt(>20mM). 
to 0.5ug^I0ul (200ul 
20inMsalt. pH does not matter 
;terile or DEPCed water to 
RNA/DHAinto. Plasmid DNA needs 



hij{h 



concentr inon 



I an be diluted and concentrated 
microcon 3 works 
-70 min per cycle), 
on ice (store on ice 5 min-3hrs) 
cioss-linked by reagent if 



of mil TOOon (even r 



t takes 



before labeling] 

if ps(»aleri -biotin reagent per 20 1 



psoral »>biotin t 



) mscrocentri [ug( 



8. 
9. 



ulof 

that was resu^)ended 
*if Psoralen-biotin was 
unount you will need for 
psoralen-biotin 3 ul DMF) 
< >f a 96-microwel] plate on ice 



lamp dnectly on top of plate 



simple. 



so that light 
Irraifiate samples for 



;e tubes and add 2 volumes 
etttact unincorporated psoralen 



Roeat 



extraction. 
Denature as normal before 
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* adding more psoralen-biotin per ug DNA/RNA does do 



Example 16
Psoralen-Biotin Labeling Experiments

Labeling RNA by standard protocol

Pool of 4 diff. fragmented RNA transcripts labeled with psoralen-biotin.

Results of hybridization to chip (5 pM each): PB-labeled targets showed approximately 5x lower intensities than IVT (bio-U+C) labeled targets.

Labeling before vs. after fragmentation:

No significant difference in hybridization intensities.

Ratio of psoralen-biotin to RNA

Labeling with a 4x higher ratio of PB : RNA does not significantly affect hybridization intensities on chips.

Time of labeling reaction/UV lamp intensity

No significant difference between 1 vs. 3 hr. labeling or 15-20 mW/cm2 (Affy lamp) vs. 5-7 mW/cm2 (S&S lamp) intensity at 365 nm.

Psoralen-biotin

Psoralens: planar, tricyclic compounds
Psoralen-biotin: psoralen conjugated to biotin via 14-atom linker arm.

High affinity for nucleic acids

Intercalates into DNA/RNA

Becomes covalently attached when irradiated with long wave UV light

Example 17
Terminal Transferase End-Labeling Protocol
This protocol is tested i 



chips.) 
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seem to improve results. 



intensities 



hr. labeling or 15-20 
at 365 nnu 



and optimized tfa mughly with only PRT 440S 






DNase fragmentation



This will have enough for 4 labeling ixeaction i: 
4 pmol of HIV PCR target (3.l7ug ofip 
DNAse(BRL) 

Calf Alkaline Phosphatase, lUAil (BR4) 
Dilution CAP Buffer (BRL) 
MgCI, 

Bring 1^ with H20 to lOOul 
ST'CforlSmin. 
95'CforlOmin. 
4'*Conhold. 



F-ddUTP are comparable 



3mol) 
5mM) 



(2 

(l)uM) 



TdT Labeling

F-N6-ddATP, FkWATP, F-ddCTP, and 
labeled in the reaction. We decided to use F-N6-ddAl|p. 

Fragmented DNA sample 25 µl (1 pmol)
5X TdT Buffer (Boehringer) 20 µl
25 mM CoCl2 (Boehringer) 10 µl (2.5 mM)
F-N6-ddATP (1 mM) 1 µl (10 µM)
TdT (25 U/µl) (Boehringer) 1 µl
H2O 43 µl

37°C for 30 min.
95°C for 5 min.
4°C on hold.



PRTUQS Hyhndizption (Rela Station) 

Labeled san4)lc IClOul 

lOX SSPE; 0.1% Triton X-100 3Q Oul 

Control (lOQnM) 213 Oligos 5ut 

HjO islsul 
45«CHybfor30min. 

20"C Wash with 6X SSPE, 0.005% Triton X-ljX); 4 cycles / 10 drain-fill. 
Scan chq) at 530nni. 1 1.25am pixel size. 
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kb insert) Xul 

Xul(lUAig) 
2.5ul(25U/rx) 
Z5ul 

Xul(1.25mM} 



T " 
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Example IS 



'Procei fares 



Aliemaie Labeling 

Ugaiion assay 

RN A can be directly labded by Ugatiiig 
withbiotinattheS'eiidwithRNAiigase. Cre,8 
T7 RNA polymerase to genetate an antisense RNA 
kinased v^th olynucleootide kinase to generate S' phospjwiylaied 
RNA was then ligated using T4 RNA ligase. 5pmol 
expfcssion chips along with ttie labeled Cre. 



ibacteriaJ 



f liCated 



nirtetiabeimgofS'RNA asingPofy A polymerase 

Poly A polymerse has been used to 
hydroxyi terminus of RNA utilizing ATP as a precursor 
JoMycong Kim et ai. (1995) NucL Acids Res., 23(12); 
successfully used poly A polymerase to tail 3' RNA 
used to label fragmenied RNA with biotin CTP to 

The advantage of this method is that 
directly labeled by Inotin CTP. Antisense RNA can 
fragmentation. The consumption of CTP can be cut 
I VT reaction. 



wit I 



geneaate 



Example 19
Direct Labeling Protocol
Reagents for direct labeling mRNA

1) 100 µM rATP 200 µl
198 µl DEPC H2O
2 µl (10 mM) rATP

2) 100 µg/ml BSA
NEB Acetylated BSA

3) 30 mM DTT

4) 10 U/µl polynucleotide kinase
Boehringer Mannheim 3' phosphatase free, cat # 83S29
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A6 RNA oligonucleotide 
gene, was transcribed with 
RNA was fragmented and 

ends. The Biotin A6 
RNA was tested on gene 



catalyze poly A tail on to the free 3' 
Recently, it was reported by 
2245-2251. that they 
CTP. This method can be 

labeled target 
RNA (mRNA) can be 
be labeled after 
by l/5th compared to an 



als> 
do vn 
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iSS' 



H,0(l^g) 

$u£fer 



5) 1 nmole/µl BioA6
Genetics Institute

6) 5 U/µl T4 RNA Ligase + 10X T4 RNA Ligase Buffer
Epicentre Technologies, catalogue # Lks025

7) 5X RNA Fragmentation Buffer
200 mM Tris-Acetate, pH 8.1
500 mM KOAc
150 mM MgOAc

Direct Labeling Protocol

Fragmentation

Add to a 1.5 ml sterile tube:
8 µl poly(A)+ RNA in DEPC-H2O (1 µg)
2 µl 5X RNA Fragmentation Buffer
Heat to 94°C for 35 minutes.

Kinase Reaction

Add to the 10 µl fragmented RNA:
2.4 µl rATP (100 µM)
2 µl BSA (100 µg/ml)
2 µl DTT (30 mM)
1.6 µl DEPC-H2O
2 µl polynucleotide kinase (10 U/µl)
Incubate at 37°C for 2.5 hours. Heat to 94°C for 2 minutes (heat kill enzyme).

T4 RNA Ligase Reaction

Add to the 20 µl kinased RNA:
0.5 µl BioA6 (1 nmole/µl in DEPC-H2O)
3 µl rATP (10 mM)
3 µl 10X T4 RNA Ligase buffer
0.5 µl DEPC-H2O

30 ]7"C overnight - 2 di^s. 94"C for 2 minutes. 
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t > 94X for 2 minutes (heat kill 



Example 20 

Computer Algorithms to Perform Basecalling on a Target DNA
Sample Hybridized or Ligated to Generic DNA Arrays.
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25 
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tof & 



ta fget 



Resequencing a DNA target by generating a set of n electronic tiling arrays on an n-mer generic DNA array.

This method of resequencing the target is similar to the method used with customized resequencing GeneChips except that, unlike the custom GeneChips which physically place a single series of tiling probes on the chip, with a generic GeneChip a computer electronically reconstructs a set of n tiling arrays by fetching the appropriate probe information from the generic array (a generic array contains all possible n-mer sequences). In general, to resequence a target DNA, the target is decomposed into an n-mer complement word spectrum of tiling probes. For each tiling probe, there exists a set of "first order nearest-neighbor" tiling probes (probes containing a single base substitution) on the generic chip (generic chips also contain higher order nearest neighbors). This process is termed tiling through the target sequence with n-mer words (Fig. 24). To make a basecall at a given position within the target, the intensity of the tiling probe at that position is compared to the intensities of its "nearest-neighbors" at that position. There are n sets of such "nearest-neighbors" because the single base substitution can occur at n different positions within the probe. The base substitution at a particular position within the probe that yields the highest intensity is the base called at that position within the probe (Fig. 25). The advantage of using a generic DNA array vs. the standard custom GeneChips is the high degree of redundancy achieved in the basecall of the target. An n-mer generic array makes n base calls for each base within the target whereas the custom resequencing GeneChips make only a single basecall.

The final basecall of a target base is decided upon by an electronic vote of the base calls from the n different electronic tilings at each target position (Fig. 26).

Empirically using the accuracy of the basecalls derived from the n electronic tiling arrays to filter out inaccurate electronic tilings.

A given reference DNA sample is hybridized/ligated to a generic DNA array. A set of n electronic tilings are generated and the corresponding basecalls made. A correctness score table is constructed by giving a score of 1 if a given tiling substitution series makes a correct basecall or a score of 0 if the basecall is incorrect (Fig. 27). A confidence level for a given basecall can also be attached to each scoring according to the ratio of the intensities of the base substitutions for any given basecall.

A variant DNA sample is then hybridized/ligated to a second generic DNA array. Again a set of n electronic tilings are generated, except this time all tilings are discarded which have a 0 correctness score, and only those tilings which have a correctness score of 1 are included in the overall base voting procedure (Fig. 28). The result is to dramatically improve the overall percentage of correct basecalls.
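The electronic tiling, voting and filtering steps described above can be illustrated with a short sketch. It is not the implementation used in these experiments: it assumes intensities are stored in a dictionary keyed by the target-sense n-mer word (the probes on a real array are the complements), treats missing probes as zero intensity, and uses hypothetical function names throughout.

```python
from collections import Counter

BASES = "ACGT"


def tiling_calls(intensity, ref, pos, n):
    """Return {k: called base} for each electronic tiling k covering ref[pos].

    k is the position of the interrogated base within the n-mer word. For each
    k, the tiling probe and its three single-substitution nearest neighbors are
    compared and the base with the highest intensity is called.
    """
    calls = {}
    for k in range(n):
        start = pos - k
        if start < 0 or start + n > len(ref):
            continue  # this window would run off the end of the target
        word = ref[start:start + n]
        neighbors = {b: word[:k] + b + word[k + 1:] for b in BASES}
        calls[k] = max(BASES, key=lambda b: intensity.get(neighbors[b], 0.0))
    return calls


def trusted_tilings(reference_calls, true_base):
    """Tilings with a correctness score of 1 on the reference sample."""
    return {k for k, base in reference_calls.items() if base == true_base}


def vote(calls, keep=None):
    """Majority vote over tilings; if keep is given, only those tilings vote."""
    votes = Counter(base for k, base in calls.items() if keep is None or k in keep)
    return votes.most_common(1)[0][0] if votes else None


if __name__ == "__main__":
    def spectrum(seq, n):
        # Idealized data: every n-mer present in the sample gets intensity 1.0.
        return {seq[i:i + n]: 1.0 for i in range(len(seq) - n + 1)}

    reference, variant, n = "ACGTAGGCTA", "ACGTACGCTA", 4
    keep = trusted_tilings(tiling_calls(spectrum(reference, n), reference, 5, n), reference[5])
    print(vote(tiling_calls(spectrum(variant, n), reference, 5, n), keep))
```

In this idealized example the variant base at position 5 is called by all four tilings. With real intensities, tilings that fail on the reference sample would be excluded from the vote through the `keep` set.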



only 1 hose tilings \ 



Compormg "loatlfy" morm^d tilmg probe intau^ between a nferenee sample 
tmd a variant a a senvtiveme^odtf detecting a i 

For a given n-mer generic array, the abjl 
target decreases as the complexity of the target in 
increases, the number of n-mcr tiling probes which i 
increases, the cross-talk between nearest neighbors at i 
and the overall cross hybridization increases. AUtfiesel 
of the bases within the target The comparison of as 
target provides a powerful way to **filter out** all the i 



Uty to correcdy resequence a 
As the target complexity 
themselves within the target 
I lifierent positions increases, 
fiictors contribute to miscalls 
target against a reference 
ni m-specific noise via diffisrence 



tmcreases. 
rqeait 



sainplet 



"kcal' 



One meduxl of comparison between thi 
compare the intensities of the tiling probes themselves 
comparison can be made, the intensities have to be 
accoum for both chip to chip and sample to sample 
naimaltzatiott process to nonnalize the signals. By 
divide the intensity of the tiling probe by the sum 
neighbors (Fig. 29). 
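The "local" normalization described here (the intensity of a tiling probe divided by the sum of the intensities of its nearest neighbors) can be written as a two-line calculation. The sketch below is illustrative only; the function names are hypothetical, and the difference between sample and reference values is shown as one simple way to compare the two.

```python
def local_normalize(tile_intensity, neighbor_intensities):
    """Tiling probe intensity divided by the sum of its nearest-neighbor intensities."""
    total = sum(neighbor_intensities)
    if total <= 0:
        raise ValueError("nearest-neighbor intensities must sum to a positive value")
    return tile_intensity / total


def normalized_difference(ref_tile, ref_neighbors, sample_tile, sample_neighbors):
    """Sample minus reference locally normalized intensity at one tiling position."""
    return local_normalize(sample_tile, sample_neighbors) - local_normalize(ref_tile, ref_neighbors)


if __name__ == "__main__":
    # A probe that drops sharply in the sample relative to the reference.
    print(local_normalize(300.0, [100.0, 50.0, 50.0]))
    print(normalized_difference(300.0, [100.0, 50.0, 50.0], 60.0, [100.0, 50.0, 50.0]))
```

Because each probe is scaled by its own neighbors on the same chip, chip-to-chip and sample-to-sample differences in overall signal largely cancel before the reference and sample are compared.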

This method of normalization creates good signal 
quite sensitive to the presence of a mutation indicated 
(Fig. 30). This **local** normalization tiling probe 
transformed by difference analysis and smoothing to 
mntation is more easily visualized. 



Induced Difference method for detecting msttationx. 
Another method for using comparisons 
sample to detect mutations is via mutationai ''induced 
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dtutions for any given basecall. 



reference and sample is to 
However, before a direct 
in some matter to 
1 employed a ""local** 
normalization, I simply 
intensities of its nearest 



noi malizcd 



vaiiation. 



loftfce 



trad ing between samples and is 

Tf the formation of a "bubble** 
comparison can be ftvther 

format where the presence of a 



between a reference and a 
iifierences** between tilings 



T " ■ 
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probes and their nearest neighbois. Application of this 
neighbor tiling analysis involves comparing 
refocace target to the cone^wnding probe tn the samph 
uninfbnnative in pan II, because they miscalled the base , 
because certain probe members within that tiling can be 
decRase in intensity) between the teftrence and the 
a mutation (Fig 31.) lliese inductions are summed 
forward and reverse strand for a given target position, 
I of whether a mutation is present or not (Fig. 32 



**locally cu)r nalized' 



sam>le 



nliethod to a first order nearest 
r probes in the 
target Tilings that where 
may tra w be infonnative 
induced (caused to increase or 
indicating the prese n c e of 
the tilings on both the 
the resultant number is a 
Fig. 33). 



overall 



aid 



Example 21
Use of Inosine on the 5' ends of the MeNPoc synthesized probes to increase duplex stability and increase the resultant ligation signal on Generic Ligation GeneChips.

We investigated the use of adding degenerate bases, such as inosine (pairs with all other bases), to the end of the MeNPoc synthesized probes to increase duplex stability. We found that indeed, the addition of 1-6 inosines onto the end of the probes did in fact increase the signal intensity in both hybridization and ligation reactions on a Generic Ligation GeneChip and allowed us to ligate at higher temperatures.

Inosines (0-6) are placed at the 5' end of the probe during manufacturing, and the effects of these terminal inosines are assayed by ligating a DNase I digested, TdT labeled 788 bp DNA fragment to the chips. The increased brightness with 2-6 inosines indicates an enhancement of duplex stability. With 6 inosines there is a slight decrease in intensity compared to 2-4 inosines because the terminal inosines are probably starting to form quartet-like secondary structures.

Example 22
Comparison between the specificity of T4 ligase and Taq ligase when used on a Generic Ligation GeneChip.

We investigated whether T4 ligase or Taq ligase was more specific in ligating target to the Generic Ligation GeneChip. In order to use Taq ligase, we need to perform the ligation reaction at 40 degrees C or higher. Consequently, we used an 8-mer chip with 6 inosines at the end of the MeNPoc synthesized probes to increase the thermal stability of the duplexes. This allowed us to perform the Taq ligase reaction at 44 degrees C and compare this to a T4 ligation reaction at 37 degrees C. Our results indicated that Taq is much more specific than T4 and ligates a set of target ends that T4 ligase is unable to ligate.

Taq lights 19 fewer features but with brighter intensity than T4 does, illustrating the specificity of Taq versus T4.

Intensity profiles of the tiling probes and nearest neighbor substitutions at given probe positions within the target illustrate that Taq is more specific than T4 and that Taq detects signal intensity at probes that T4 fails to detect signal.

It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference for all purposes.
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1 comprisu g the 



WHAT IS CLAIMED IS:

1. A method of identifying differences in nucleic acid levels between two or more nucleic acid samples, said method comprising the steps of:

(a) providing one or more oligonucleotide arrays said arrays comprising probe oligonucleotides attached to a surface;

(b) hybridizing said nucleic acid samples to said one or more arrays to form hybrid duplexes between nucleic acids in said nucleic acid samples and probe oligonucleotides in said one or more arrays that are complementary to said nucleic acids or subsequences thereof;

(c) contacting said one or more arrays with a nucleic acid ligase; and

(d) determining differences in hybridization between said nucleic acid samples wherein said differences in hybridization indicate differences in said nucleic acid levels.



2. The method of claim 1, further comprising contacting said oligonucleotide arrays with one or more ligatable oligonucleotides.

3. The method of claim 2, wherein said ligatable oligonucleotides are a pool of all possible oligonucleotides of a preselected length.

4. The method of claim 2, wherein said determining comprises [...] or more of said ligatable oligonucleotides attached to said array.

5. The method of claim 1, wherein said one or more arrays is at least two arrays and said arrays are essentially the same in probe oligonucleotide composition.

6. The method of claim 5, wherein the spatial arrangement of said probe oligonucleotides is essentially the same in said arrays.

7. The method of claim 1, wherein each of said nucleic acid samples is hybridized to a different array, the different arrays having substantially the same probe oligonucleotide composition.






8. The method of claim 1, wherein two or more of said nucleic acid samples are hybridized to a single oligonucleotide array.

9. The method of claim 8, wherein said nucleic acid samples are simultaneously hybridized to a single oligonucleotide array.

10. The method of claim 1, wherein said probe oligonucleotides are pairs of probe oligonucleotides that differ from each other in preselected nucleotides.

11. The method of claim 10, wherein said pairs of probe oligonucleotides differ from each other in a single nucleotide.

12. The method of claim 10, wherein said determining comprises determining the difference in sample nucleic acid hybridization intensity between the members of said pairs of probe oligonucleotides.

13. A method of identifying differences in nucleic acid levels between two or more nucleic acid samples, said method comprising the steps of:

(a) providing one or more oligonucleotide arrays comprising probe oligonucleotides wherein said probe oligonucleotides comprise a constant region and a variable region;

(b) hybridizing said nucleic acid samples to said one or more arrays to form hybrid duplexes between nucleic acids in said nucleic acid samples and said variable regions that are complementary to said nucleic acids or subsequences thereof; and

(c) determining differences in hybridization between said nucleic acid samples wherein said differences in hybridization indicate differences in said nucleic acid levels.

14. The method of claim 13, wherein said variable region varies in length from about 3 nucleotides to about 50 oligonucleotides.


15. The method of claim 13, wherein tl e variable regions of said 
probe oligomideotides conqirise all possible oUgonud »tidcs of a preselected length. 

35 
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16. Tbe method of claim 15, wherein saic 
5 nucleotides in length. 



17, The method of daim 13, wherein sai(| constant icgion ranges in 
length from 3 nucleotides to about 25 nucleotides. 



18. The method of claim 13. wherein said 
nucleotide sequence complementary to a sense or 
recognition site of a restriction endonuclease. 



19. The method of claim 13, 
oligonucleotide arrays with a constant ol 
region or a subsequence thereof. 



fuifter ccMD pnsing 



ligonucleotide o miplementary 



20. The method ofclaim 19, comprising contacting said array with a 



ligase. 



21. The method ofclaim 19, wbcrein 
detecting a niiddc add of said nucleic add samples 
oUgonudeotide. 



22. The method of claim 13, wherdn 
pairs of probe oUgOTudcotides that dife from each Other 



23. The method of claim 22, wherein 
()£lenniziingtfae difference in 
members of said pairs of probe oUgonudeotides. 



sample nucleic acid hybi idization intensity 



24. A method of identifying differcnas 
two or more nuddc add samples, said method compr ising 

(a) providing one or more arra> s 
comprising pairs of probe oUgonudeotides where the 
oUgonudeotides fiffcr from each other in 

(b) hybridizing said nuddc 
arrays to form hybrid duplexes between nuddc adds 



;acjd 
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variable regions are at least 



constam regions comprise a 
sequence of the 



> contacting said 

f to said constant 



sad 



attiched 



detenmnmg 
to said 



Si id probe oligonucleotides are 



m 



s^d determuuDg comprises 

between the 



in nucleic acid levels between 

the steps of: 
of oUgonudeotides each array 
members of each pair of probe 



preselected nucleotides; 



samples to said one or more 
in said nucldc acid samples and 
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probe oligonucleotides in said one or more'^nnys 
nucleic acids or subsequences thereof, 

(c) detennining Che differeoces 
nucleic acid samples wherein said differences in hyl 
5 said nucleic acid levels. 
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rcr/i)S97A)i60> 
thatjare complementaiy to said 



n hybridizaiion between said 
'biqiization indicate differences in 



25. The method of claim 24, wherein s^d 
probe oligonucleotides dififer from each other in a cenjrally 



26. A method of identifying difference 5 
two or more nucleic add samples, said method compr sing 

(a) providing one or more array \ 
array comprising mom than 100 diifoent probe oligoz ucleotides 

each dilTerem probe oligonuclet itide 
predetennined region of the array; 

each differem probe oligonucleotide as attached to a sui&ce 
through a terminal covalent bond; 

the density of said probe diffete^ 
than about 60 dififerent oiigonucleotidi 

(b) hybridizing said nucleic acii 
arrays to form hybrid duplexes between nucleic acids 
probe oltgomideotides in said one or more arrays that 
nucleic adds or subsequences thereof; 

(c) determining the difTerences 
nucleic acid samples wherein said differences in hybriilization 
said nucldc add levels. 



27. The method of claim 26, further cobiprising contacting said one or 
oligonucleotide armys with a Ugase. 



28. A method of identifyiiig differences 
two or more nuddc add samples, said method compri sing 

(a) providing one or more oligo lucleotide 
comprising probe oligonucleotides wherein said probe 
to hybridize to nucleic adds derived from 



members of each pair of 
located nucleotide. 



inmideic add levels between 

the steps of: 
of oligonucleotide arrays each 
wherein: 
is localized in a 



oligoimcleotides is greater 
per I cm^; 

samples to said one or mm 
1 said nucldc add samples and 
are complementary to said 

hybridization between said 
indicate differences in 



in middc acid levels between 
thestqnof: 
arrays each 
oligonudeotides are not chosen 
ormRNAs; 



particular pi eselected genes 
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(b) hybridizing said nucletc add i 
aitays to fonn hybnd di^lcxcs between nucleic acids in 
piobe oligonucleotides in said one or more arrays that ai 
nucleic adds or subsequences thereof, and 

(d) detcnnixiing dififeicnces in 
nuddc add samples wherein said differences in 
said nuddc add levels. 
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& imptes to said one or more 
aid nuddc add samples and 
complementaiy to sdd 



hyl ridi2Stion 



hybridisatii 



29. The mediodofclaim 28, v»*crrinsai|i probe ol 
paiis of probe oUgonudeotides that differ fiwn eadi oth sr 



sad 



30. The method of claim 29, wheidn 
determining the diffiaencc in sample oucldc acid hybrid 
members of said pairs of probe oligonucleotides. 



determining com|nises 
i( lizacioD intensity between the 



31. A method of identifying differences 
two or more nuddc add samples, said method compri^ 



(a) provi^ng one or more 
compriang probe oUgonudeotides wherein said probe 
nudeotidc sequence or subsequences sde< 
group consisting of a random selection, a haphazard 
composition biased selection, and all possible ol 



(b) hybridizing sdd nuddc ad< 1 
arrays to form hybrid duplexes between nuddc adds 
probe oUgonudeotides in sdd one or more arrays thai 
nuddc acids or subsequences thereof, and 

(c) determining differences in 
Dttdeic acid samples wherein said differences in 
sdd nuddc acid levds. 



32. Tlie method ofdaim 31, wherein 
nnfflefttide subsequences are all possibi c ol 
selected from the group consisting of: dl possible 



between sdd 
ion indicate dififereoces in 



oUgor udeotide 



in nuddc add levds between 
the steps of: 

arrays each 
oUgonudeotides comprise a 
to a process selected from the 
selection, a micleotide 
tligonu ileotides of a preselected 



samples to said one or more 
^ sdd nudeic add samples and 
are complementary to sdd 



] tybridization between sdd 

ion indicate differences in 



hybiidizatii 



sdd micleotide sequence or 
lUgonuclec tides of a iKeselccled lengtii 
6 mers, aU possible 7 mers, dl 
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10 






possible 8 mers, all possible 9 men, all possible 10 1 
possible 12 mets. 



nfers, all possible i I meis, and all 



I pool of taiget nucleic i cids comprisiiig 



33. A method of simultazieously 
multiplicity of genes» said method comprising: 

(a) providing a 
of one or more of said genes, or nucleic acids derived 

(b) hybridizing said pool of mideic 
comprising piobe oligonucleotides immobilized on 

(c) contacting said oligonucleotide 

(d) quantifying the hybridization of sajd 
wherein said quantiiying provides a measuxe of the 



mon toring ibc expression of a 



RNAt 

from said RNA transcripts; 
to an oligonucleotide array 
ajsurfiioe; 

with a ligase; and 
nucleic acids to said aiiay 
of tnnscription of said 



acidst 



:arriy 



levels < 



wherein said 



ibsequeices 



34. The method of claim 33, 
comprise nucleotide sequeces or nucleotide sul 
preselected RNA transcripts of one or more of said 
from said RNA transciipis. 



probe oligonucleotides 
conq;>lementaiy to 
;enes, or nucleic acids derived 



olig mucleotide 



35. A method of simultaneously monjtoring the expression of a 
multiplicity of genes, said method comprising: 

(a) providing one or more 
probe oligonucleotides whexeiD said probe oligonucleotides 
region and a variable region; 

(b) providing a pool of target 
transcripts of one or more of said genes, or nucleic 
transcripts; 

(c) hybridizing said pool of ni^leic acids to said array of 
oligonucleotide probes; and 

(d) quantiiying the fayhridizBtibn of said nucleic acids to said 
arr^ wheieia said quantifyiiv provides a measure o 'the levels of transcription of said 



arrays conqmsmg 
com|mse a constant 



I lUcleic acids cominising RNA 
ai ids derived fiom said RNA 



36. The method of claim 3S, wfaereii] said probe oligoxuideotides 
35 comprise nucleotide sequeces or nucleotide subsequ sices complementary to 




preselected RN A transcripts of odc or more of said 
fiom said RNA transcripts. 






airay 



ni icteie 



CO! npnse 



37. A method of making a nucleic acid 
differences in nucleic acid levels between two or more 
method comprising the steps of: 

(a) providing an oligonucleotide a ray 
oligonucleotides wherein said probe oligonucleotides 
a variable region; 

(b) hybridizing one or more of saijl 
arrays to form hybrid duplexes of said variable re^on 
acid samples comprising subsequences complementary 

(c) attaching the sample nucleic 
diq)lexes to said array of probe oligonucleotides; and 

(d) removing unattached nucleic 
oligonucleodde anay bearing sanq>le inideic adds attacjied 



lard 



aiids 



1 dds to provide a lug)i density 
to said amy. 



:acid{iray 



38. A method of making a nucleic 
difteiences in ntxleic acid levels between two or mote 
method comprising the steps of: 

(a) providing an array comprisinj ; 
probe oligonucleotides wherein: 

each difTcfent piobe oligonuclecjtide is localized in a 
predetermined r^on of the array; 

each different probe oligonucieo|ide 
through a temiinal covalent bond; 

the density of said probe dififereiU 
thaw about 60 difierem oligonudeotides 

(b) contacting said array one or 
nudeic acid samples whereby nucleic acids of said ont 
add samples form hybrid duplexes with piobe ol 

(c) attadung the sample nudeic 
diqriexes to said axiay of probe oligonucleotides; and 

(d) removing unattached nuclei ; 
oligonucleotide array bearing sample nucleic 



: acids attiched 
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, or nucleic acids derived 



for identifying 
acid samples, said 



comprising probe 
a constant region and 



nuddc add samples to said 
nuddc acids in said nudeic 
said variable region; 
coro]Kising said hybrid 



for identifying 
ilucldc add samples, sdd 



more than 100 different 



\ is attached to a surface 



oligonudeotides is greater 
per 1 cm^; 

nore of said two or more 
of said two or more nuddc 

insaidanqrs; 
adds comprising said hybrid 



lUgom deotides 



adds to provide a high density 
to said anay. 
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39. A kit for identifying diffecences' 



PCT/US97A16Q9 
inbudetc acid levels between two 



or more nucleic acid samples, said kit comprising: 
a container containing one or 

arrays comprising probe oligonucleotides attached to i 
a container containing a ligBse. 



mpre oligonucleotide airays said 
surface; and 



40. A kit for identifying differences in|nucleic acid levels between two 
or moie nucleic acid samples^ said kit comprising: 

a container containing one or n 
arrays comprising probe oligonucleotides wherein sai<| i 
comprise a constant region and a variable region^ 



» further compri sing 



41. The lot of claim 40, 
complementary to said constant region or a subsequenbe 



42. A method of labeling a nucleic aci U said method comprising the 



steps of: 



amplicons; and 
said fragments. 



(a) providing a mideic acid; 

(b) amplifying said nucleic ack 

(c) fiagmenting said amplicons 

(d) coupling a 



43. A method oflabding a nucleic acil, said method comprising the 



(a) providing a nudeic acid; 

(b) transcribing said nucleic ac(d to formed a transcribed 

nucleic acid; 

(c) fiagmenting said transcribe^ nucleic acid to form fragments 
of said transcribed mideic add; and 

(d) coiq>ling a labeled moiety t|> at least one of said fiagments. 



oligonucleotide anays said 
probe oligonucleotides 



a constant oligonucleotide 
thereof. 



to form amplicons; 
to form fragments of said 
labeled moiety to at least one of 



44. A method of labding a nucldc ac d conqirising the steps of: 
(a) providing at least one midc tc acid coupled to asupport; 
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tenninal transferase 



moiety caiiable 



(b) providing a labeled 
to said fiitdeic add; 

(c) providing said tenninal transfi^rase; 

(d) coi^ling said labeled moiety 



5 tenninal tnmsfarase. 



10 



44. A method of labeling a nucleic acid 

(a) providing at least two nucleic 

(b) increasing the number of moi omer 
addsto fonn a common nudete add tail cm said at leas : 

(c) providing a labeled moiety capable 
common nucldc add tails; and 

(d) contacting said common 

moieiy. 



^mprising the steps of: 
acids coupled to a siqipoit; 

units of sud nucldc 
two nuddc adds; 

of leeogoizing sdd 



nuc] eic acid tails and sdd labeled 



IS 



20 



45. A method of labeling a nuddc acid 

(a) providing at least one nucld< 

(b) providing a labeled moiety cppable 
ligase to sdd nuddc add; 

(c) providing sdd ligase; and 

(d) coupling said labded moiet; 

Ugase. 



25 



46. A compound bavnig the fonnula: 



30 



35 



whcidn Rl is hydrogen, hydroxyl, a phosphate 
R2 is hydrogen or hydroxyl; 
R3 is hydrogen, hydroxyl. a phosphate linkage, or a 
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: of bdng coupled with a 



r,and 

\ sdd nucleic acid using sdd 



comprising the steps of: 
add coupled to asuppoit; 
of bdng coupled with a 



to sdd nuddc add using said 



links se, or a phosphate group; 



)hosphate grotq); and 
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10 






R4 is a coupled labeled moiety. 



47. A compound having the fonnula: 




ft. 

wherein Rl is hydrogen, hydroxy!, a phoqshate 
R2 is hydrogen or hydroxyl; 
R3 is hydrogen, hydroxyl, a phosphate linkage, or a 
R4 is a coupled labeled moiety. 



or a phosphate group; 
I phosphate group; and 



comp nsing 



iligmucli 



ird mg 



48. Amethodof identi^ngdifiii 
two or more micleic acid samples, said method 

(a) providing one or more ol 
comprising probe oUgonucleotidei wherein said prode 
nucleotide seqixnce or subsequences selected accoi 
gnM^ consisting of a random selection, a haphazard 
composition biased selection, and all possible 
length; 

(b) providing software describjng 
probe oligonucleotides on said array; 

(c) hybridizing said nucleic 
arrays to form hybrid duplexes between nucleic addj 
probe oligonucleotides in said one or more arrays tha t 
nucleic acids or subsequences thereof; 

(d) operating said software 
difierences in said nucleic acid levels. 



;acd 
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in nucleic acid leveb between 
tlie5tq}Sof: 
ieotideanayseach 
oligomteleoddes comprise a 
to a process selected from the 
election, a nucleotide 



oligon icleotides of a preselected 



the location and sequence of 



samples to said one or more 
in said nucleic acid samples and 
are complementary to said 



sue h that said hybridizing indicate 
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FIG. 6 
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CEIITRAL 



ICSSOR 



|00 



SPEAKER 



nxED 

DISK 



HETTO K 
IMTEBFKE 



FIG. 7 
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npuof 
m of ptffvo maicfi 
prates wim a g«ft« 



Comnnng tiM hyMdization mtemi es 
off the pwfea matcto and mismatc i 
praon Of oacn pair 



a 






SCAN DATA 
N pnrs 01 raw and 1^ 



BACKGROUND 8UBTRACT1C3N 
N pan of 1^ and 




1 

1 NPOSat 


POS*l 

1 






NNEG»» 


NEC ♦I 



J c 



DECISION MATRIX based on N. 
NP08.NNEG.andlJR 



3 7V 



Dcctstoo Ca9 
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^79 



COMPUTE 
Averaoe<TO-Ul) 
AveraoeCDfF) for NPOS vxt NNBG 



QUANTITATIVE MEASUnaefr 



F/G.9 
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. B/ISEUNE SCAN DATA 



aACK6ROUI«> 
SUBTRACTION 



L 







DCP 


£RlME^r^scANOATA i 
mornwj^and j_ ; 

M j 




^ ^1 


i 


BACKGROUND 
SUBTRACTION 



NORIIAUZE 



rsandJ's 




T 



1 
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— NINONINC^I 



NOEC • NOEC ♦ 1 



3 30 



COMPUTE 
NPOSB. NNSa. UWMrn 
NPOSe. UW Mr A 



ABSOLUTE DECISION 
ICOMPUTATIOM FOR 



BASCUNE AND I 



DECISION MATPIX FOR 
DIFFERENCE CALLS 



COMPUTE 
Av«fi9e(Upm > Jmm) • (Ipm • If^)) 
Av«rag«Upm - Jmm) / Avmge (ipn - tmm) 



QUANTITATIVE OlFFERENqE 
RESULTS 
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I 



F/G. fOB 
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togMwn 



•PPtndMloom 



Oont.BOt»iiea9n» 



F/G. tt 



aiogtnt 
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Fig. 13t/ 
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Fig. 14a 



5- 13- 

:tc 



Sad site: GAGCTC 

.XXXXXXXXfSAGCT^mXX)!^ 



Sad 



Target 



CXXXXXXXX- 

Sad digested Target 
fragment 



Fig. 14b 



Generic 8-mer 



last base of andior ^ 
compatible with Sad ^ 
digestion 



Chip Andior 
Sequence 




//////////////////// 



Hsp9211site: CAT 



G^ 



I fragment 
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Fig. 14c 



Monitoring mRNA expression from org£ nisms with small genomes: 




- 1 kb genon ic or cONA fragments with 5* C 
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Fig. 15a 




PJ\?J>A. 



first strand i :ONA synthtsts 



Fig. 15b 



cONA 




-'GV 



* random pnmer set with restriction sito 



Fig. 15c random primed PGR of cDNA 




-5nc1 ^ - ^ ^ -'GV^lTTTTTTTTTTTTTT 



poiy(A) tau 



TTTTTTTTTTT" 



^bachor 



primer 



AAAAAAAAAAAAM. 




restriction cut 



;Vf AAAAAAAAAAAAAA. . 
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Fig* 15d RMmctionoigesi PGR products. 




( : Vf AAAAAAAAAAAAAA . 
< ;V J.TTTTTTTTTTTTTT 



Fig. 15e 



Sort fragmants by S' ends on Generic Ligai ion GeneChtp. 



sorted fragmems 
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Intensity differeneai 
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Post-Fragmentation End Labelling- QAPtP Treatment 



[Bar chart. Conditions: 25 U TdTase, 1 nmol FITC-ddUTP. Vertical axis: percent. Horizontal axis: CIAP treatment during DNase I digestion. Bar values not recoverable from the scan.]

Fig. 18
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■ MoTrtst . 
!■ 1 UClApj 
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Post-Hybridization End Lab elling on the Chip 



Multiplex PCR 

CesomieONA 
FCR Buffer 

FragmentatioD 

DNAsti 
I 

Hybridization 

SSFE 
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EndLabfUing 

FITC-dUTP 
dATP 
TdTue 
CaQ2 

KCacodylite 




Fig. 19 
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Fig, 21 
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OligodT Labelling on ih 



MoltipkxPCR 

1 




Fragmentation 

1 




Hybridizatton 




.2 








W«fa/Dnia 












1 




1 






mmmL 







SaMtneFITCwtth: 
•RhnrtimhieiUIO 

•cyf 



Chip 



Oligo Synthesis 

tfTa ■ ■ 
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la^elim^ AffA* Glare 



ifnidssols 



Fig. 23^ 



Mb Wm 



Br O 



«>tri8Z n»-3,5r2H,4H)-dlQn© 
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-XT' 

NQIH MOH 



base - hdtcrocyclic moiety (eg. 
andogs t mof ; 
'O'vv^ * li ikei* 
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Ffg^Y Rasequendng a target DMA molecule with a 
probes. 



! et of generic n-mer tUIng 



i.e. 4-mer probes

Target:   TGACATAGGACAGCGAAGGGA

Probe 1:  ACTG
Probe 2:  CTGT
Probe 3:  TGTA
Probe 4:  GTAT
Probe 5:  TATC
Probe 6:  ATCC
Probe 7:  TCCT
Probe 8:  CCTG
Probe 9:  CTGT . . .
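The tiling in this figure can be reproduced with a short sketch. It assumes, as the listed probes suggest, that each probe is the base-by-base complement of one n-base window of the target, written in the same left-to-right order as the figure (no reversal). The names `COMPLEMENT` and `tiling_probes` are illustrative only.

```python
# Watson-Crick complement of each base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def tiling_probes(target, n=4):
    """Return the n-mer probes complementary to each successive window of target.

    Probes are written in the same left-to-right order as the target window,
    matching the layout of the figure.
    """
    return [
        "".join(COMPLEMENT[base] for base in target[i:i + n])
        for i in range(len(target) - n + 1)
    ]


probes = tiling_probes("TGACATAGGACAGCGAAGGGA")
for number, probe in enumerate(probes[:9], start=1):
    print(f"Probe {number}: {probe}")
```

The first nine probes printed agree with the sequences listed in the figure (ACTG, CTGT, TGTA, GTAT, TATC, ATCC, TCCT, CCTG, CTGT).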



Fig. 26 Four Electronic Tiling Arrays are present on a 4-mer Generic Array
(4 x 3 = 12 "nearest neighbors" for each probe)

i.e. Probe 5:      Pos. #1   Pos. #2   Pos. #3   Pos. #4

                   TATA      TAAC      TATC      AATC
                   TATG      TAGC      TGTC      GATC
                   TATC      TACC      TCTC      CATC
                   TATT      TATC      TTTC      TATC

                   AGCT      AGCT      AGCT      AGCT
Basecall:          C         T         G         T
Target Position:   8         7         6         5

etc.
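The substitution sets in this figure can be enumerated mechanically. The sketch below assumes that "Pos. #1" refers to the rightmost base of the probe as printed, which is what the Probe 5 columns imply, and that substitutions are listed in the order A, G, C, T used in the figure. The function names are hypothetical.

```python
BASES = "AGCT"  # substitution order used in the figure


def substitution_sets(probe):
    """Map each position number to the four probes that vary only at that position.

    Position #1 is taken to be the rightmost base of the probe as printed
    (an assumption read off the figure).
    """
    sets = {}
    for pos in range(1, len(probe) + 1):
        idx = len(probe) - pos
        sets[pos] = [probe[:idx] + base + probe[idx + 1:] for base in BASES]
    return sets


def nearest_neighbors(probe):
    """Return the single-base substitution neighbors of probe, excluding probe itself."""
    members = {p for group in substitution_sets(probe).values() for p in group}
    return sorted(members - {probe})


for pos, group in substitution_sets("TATC").items():
    print(f"Pos. #{pos}: {' '.join(group)}")
print(len(nearest_neighbors("TATC")), "nearest neighbors")
```

For a 4-mer this yields 4 x 3 = 12 distinct neighbors, consistent with the count given in the figure.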
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Fig 3,1? Base-Calling at the 8 position in the target. 

8th

Target: TGACATAGGACAGCGAAGGGA

                  3'-5'     Base-Call
Probe 5, Pos.     TATC      T
Probe 6, Pos.     ATCC      G
Probe 7, Pos.     TCCT      C
Probe 8, Pos.     CCTG      C

C is the winner!
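The vote shown here amounts to taking the most frequent base among the calls made by the overlapping probes. A minimal sketch follows; the function name `call_base` and the choice to return "N" on a tie or on an empty vote list are assumptions of this example, since the figure does not say how ties are resolved.

```python
from collections import Counter


def call_base(votes):
    """Return the plurality base among votes.

    Returns "N" when there are no votes or the top count is tied
    (tie handling is an assumption of this sketch).
    """
    ranked = Counter(votes).most_common()
    if not ranked:
        return "N"
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "N"
    return ranked[0][0]


# Base-calls from Probes 5 to 8 in the figure.
print(call_base(["T", "G", "C", "C"]))
```

With the four calls from the figure (T, G, C, C), C receives two votes and the other bases one each, so C is returned.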



Fig. 28 Base Vote Table:

[The rotated column headers to the right of "Base Vote" are illegible in the scan; the 0/1 entries are reproduced as scanned, and "?" marks an illegible cell.]

Base      Base      Base
Position  Identity  Vote

?         T         T       1     1  0  1  0
6         A         A       1     1  1  1  1
?         T         T       1     0  1  0  1
[row(s) illegible]
10        T         T       1     1  0  1  0
11        ?         ?       1     0  1  1  1
12        T         T       1     0  1  1  1
13        C         C       1     1  0  0  1
14        G         ?       1     0  1  1  1

Totals    [illegible]
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Fi§ Mutation Detection by Intensity Comparisons 
"Bubble" formation 




Position of Probe in Targe 

Algorithms:

$I_{\mathrm{normalized}} = I_{\mathrm{probe}} / \sum I_{NN}$

$I_{\mathrm{Difference}} = I_{\mathrm{normalized,variant}} - I_{\mathrm{normalized,control}}$

[A further expression in $I_{\mathrm{normalized,variant}}$ and $I_{\mathrm{normalized,control}}$ follows; its operator is illegible in the scan.]

Legend: Control, Variant

- Locally Normalized Intensities track well.

- Local Normalization is sensitive to mutations.
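The two algorithms on this sheet can be written out as a short sketch. It assumes that the nearest-neighbor sum excludes the tiling probe itself, which the figure does not state explicitly, and the function names and input layout are hypothetical.

```python
def local_normalize(probe_intensity, neighbor_intensities):
    """Divide a tiling probe's intensity by the summed intensity of its nearest neighbors.

    Assumes the neighbor list does not include the probe itself.
    """
    total = sum(neighbor_intensities)
    if total <= 0:
        raise ValueError("nearest-neighbor intensities must sum to a positive value")
    return probe_intensity / total


def normalized_difference(variant, control):
    """Return normalized variant minus normalized control at each target position.

    variant and control are lists of (probe_intensity, [neighbor intensities]),
    one entry per position, in the same order.
    """
    return [
        local_normalize(*v) - local_normalize(*c)
        for v, c in zip(variant, control)
    ]


# Hypothetical intensities for two positions; the second carries a mutation.
control = [(300.0, [50.0, 25.0, 25.0]), (200.0, [50.0, 50.0])]
variant = [(300.0, [50.0, 25.0, 25.0]), (50.0, [100.0, 100.0])]
print(normalized_difference(variant, control))
```

In this made-up example the first position is unchanged and gives a difference of zero, while the second shows a clearly negative difference, the kind of local departure the figure describes as a "bubble".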
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Mr 

Fig. 30 Bubble Formation detection of m 
Normalized Intensity Comparison: 



i 

g| OJ 


Ward Imeniicy Cominriso 

2 « 10 14 18 22 25 20 24 


% 1 

. i 

38 42 46 50 g 


i 
§ 

H -OJ 


2 6 10 14 18 22 25 30 3^ 


> 38 42 46 so 3 
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itation in HIV genome 



Normalized Difference: 






Fig Bi induced Difference Nearest Neighb{>r 



Reference Sampl^ 

substitution position: f ^ 

TATC >. TATC 

TGTC \ TGTC 

TCTC TCTC 

TTTC TTTC 



' Probe Scoring: 



AGCT 
BaseCali: Q 



AGC 
G 



Induced Difference: D/^ = (lv,A*'c i\)"c a) 

•Average induced differences over all filings and over 
both forward and reverse strands. 



D(A->T):

• Probe with A "down-regulated"

• Probe with T "up-regulated"

• A -> T mutation

• Total Induced Difference > + Threshold : Mutation Exists
• Total Induced Difference < - Threshold : Mutation Exists
• Two criteria for mutations: Induced Difference Scores, Bubble Formation
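The threshold rule on this sheet can be sketched as below. The exact induced-difference formula is not legible in the scan, so this example assumes the simplest reading: the locally normalized intensity of a substitution probe in the variant minus that in the reference, summed over tilings and over both strands. The function names and the sample numbers are hypothetical.

```python
def total_induced_difference(variant_norm, reference_norm):
    """Sum (variant - reference) normalized intensities for one substitution probe.

    Each list holds one locally normalized intensity per tiling and strand.
    The subtraction used here is an assumed stand-in for the illegible formula.
    """
    return sum(v - r for v, r in zip(variant_norm, reference_norm))


def mutation_exists(total, threshold):
    """Apply the two-sided rule: above +threshold or below -threshold."""
    return total > threshold or total < -threshold


# Hypothetical normalized intensities from two tilings.
up = total_induced_difference([0.5, 0.75], [0.25, 0.25])    # induced upward
down = total_induced_difference([0.25, 0.25], [0.5, 0.75])  # induced downward
print(up, mutation_exists(up, 0.5))
print(down, mutation_exists(down, 0.5))
```

Both the "up-regulated" and the "down-regulated" probe cross the threshold in this example, which is the pattern the sheet associates with an A -> T mutation.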
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Flg^^jt, Mutations found in an HIV PGR tar^at (B) uaing a Genarie 
Ligation GeneChip and tnduead DIf erance Analyaia 



a c t 9 t a t c cj^^a gcttccctcagatca 



mm 



actgtatcctttaacttcccteagatca|ct 



lattagaagaaatgagtt tgccaggaag 



lajitalqatc aga t aj: a t^a^oa^^jt^jt^ 



lagtatgatcagatactcatagaaatctg 



[agaaatt 1 gtacagaaatggaaaagga a 



agaaatttgtacagrgatggaaaaggaa 



^^^^^^^^jjj^j^^^^^a^^^^^^aaaa 



cat cccgcagggi cmaaaaagaaaaaat 
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c t 



t I 
t9 



t 9 



99 



99 



[ttagatgaaga e^^ag^7^'^^TTg[^ 



_A3^AA^AA3 aatecaqa 



tagagecttttagataacaaaatccaga :a 




a t a g a ^ j ^^p a at t g a g^j^^^ajct 

{caaaaatagaggagctgagaeaacatct 



t 



Ic tq t t qaoqTQOo q~at t t a c c a c c a | i c 



Ictgt tgaggtggggactt ncTTT^ 



cag 



1 c 



CGAT 
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HIV HXB2-4. 12/19/96 expl. 



INTERNATIONAL SEARCH REPORT 



bienutiofuJ ippUcrtioo Ho. 
PCT/US97/01603 



A. CLASSinCATlON OF SUBJECT MATTER 

IPC(6) :C12Q 1/00. 1/68; GOTH 21A» 

USOL :43S/4. 6: 536Q2.1 
Agcoidiogio toieinaBBal Htm, rhwiftrtlinn (IPC) or to both anioiMJ chwifiifittnn and IPC 



B. HELPS SEARCHBD 



U S. : 435/4.6; 53602-1 



APS. STN 



ked dorins (he imrnminnal mmcI) (namB of d 
acid, hybridization, airay, ligaso 



daU h—o md. where pncticabic, ■cardi icnns wed) 



C. DOCUMENTS CONSIDERED TO BE RELEVANT

Citation of document, with indication, where appropriate, of the relevant passages

LIPSHUTZ et al. Using Oligonucleotide Probe Arrays To Access Genetic Diversity. Biotechniques. September 1995, Vol. 19, No. 3, pages 442-447, see entire document.

24-26, 28-32, 48

SOUTHERN et al. Analyzing and Comparing Nucleic Acid Sequences by Hybridization to Arrays of Oligonucleotides: Evaluation Using Experimental Models. Genomics. 1992, Vol. 13, pages 1008-1017, see entire document.

EP 0 320 308 A2 (BIOTECHNICA INTERNATIONAL INC.) 14 June 1989, columns 1-2.

EP 0 336 731 A2 (CITY OF HOPE) 11 October 1989, see abstract.



1-23, 27, 37^1 

24-26, 28-32, 
48 

1-23, 27, 37-41 
1-23, 27. 37-41 

1-23, 27, 37-41 



□ 



Further documents are listed in the continuation of Box C.

Date of the actual completion of the international search
01 MAY 1997

Date of mailing of the international search report
14 MAY 1997

Form PCT/ISA/210 (second sheet)(July 1992)



INTERNATIONAL SEARCH REPORT

International application No.
PCT/US97/01603

BOX II. OBSERVATIONS WHERE UNITY OF INVENTION WAS LACKING
This ISA found multiple inventions as follows:

This application contains the following inventions or groups of inventions which are not so linked as to form a single inventive concept under PCT Rule 13.1.

Group I. Claims 1-32, 37-41, and 48, drawn to a method of hybridizing nucleic acids.

Group II. Claims 33-36, drawn to a method of monitoring gene expression.
Group III. Claims 42-45, drawn to a method of labeling a nucleic acid.
Group IV. Claims 46 and 47, drawn to a compound.

and it considers that the International Application does not comply with the requirements of unity of invention (Rules 13.1, 13.2 and 13.3) for the reasons indicated below:

The inventions listed as Groups I-IV do not relate to a single inventive concept under PCT Rule 13.1 because, under PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons:

InveokktM 1 txMiiiMiaci Ibe apccial leehaioal fiealuro (a ouclde acid array) 
tcchnk^ ftatuR* aiid the te nic«lM>d of imgtf^ 
of uae the i|Md>l iMtakU fe«bne, and inwaiioa N 
fntoieof bitcttioikl. 


: not ID liakBd ai to form a liacfe 
ion. nwrtiod of makac a imelde add 

DR. 

iBCflta of uniQr of ismiiioa (RttleB 

tar KT Rak 13.1 became, under 
Ihe following reasoiu: 

, the method of attktng the tpeeial 
ewcotiooa n-in ai« difieicm, nedsoda 
« miaiiened the ipecial technical 
NUCLEIC ACID ANALYSIS TECHNIQUES

CROSS REFERENCE TO RELATED APPLICATIONS

This is a continuation-in-part of U.S.S.N. 60/010,471 filed on January 23, 1996 and a continuation-in-part of provisional patent application for "Labeling of Nucleic Acids" naming Lockhart, Cronin, Lee, Tran, Matsuzaki, McGall and Barone as inventors, filed on January 9, 1997, both of which are herein incorporated by reference for all purposes.



BACKGROUND OF THE INVENTION

A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the xerographic reproduction by anyone of the patent document or the patent disclosure in exactly the form it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

Many disease states are characterized by differences in the expression levels of various genes either through changes in the copy number of the genetic DNA or through changes in levels of transcription (e.g. through control of initiation, provision of RNA precursors, RNA processing, etc.) of particular genes. For example, losses and gains of genetic material play an important role in malignant transformation and progression. These gains and losses are thought to be "driven" by at least two kinds of genes. Oncogenes are positive regulators of tumorigenesis, while tumor suppressor genes are negative regulators of tumorigenesis (Marshall, Cell, 64: 313-326 (1991); Weinberg, Science, 254: 1138-1146 (1991)). Therefore, one mechanism of activating unregulated growth is to increase the number of genes coding for oncogene proteins or to increase the level of expression of these oncogenes (e.g. in response to cellular or environmental changes), and another is to lose genetic material or to decrease the level of expression of genes that code for tumor suppressors. This model is supported by the losses and gains of genetic material associated with glioma progression (Mikkelson et al. J. Cell. Biochem. 46:3-8 (1991)). Thus, changes in the expression (transcription) levels of particular genes (e.g. oncogenes or tumor suppressors), serve as signposts for the presence and progression of various cancers.

Similarly, control of the cell cycle and cell development, as well as diseases, are characterized by the variations in the transcription levels of particular genes. Thus, for example, a viral infection is often characterized by the elevated expression of genes of the particular virus. For example, outbreaks of Herpes simplex, Epstein-Barr virus infections (e.g. infectious mononucleosis), cytomegalovirus, Varicella-zoster virus infections, parvovirus infections, human papillomavirus infections, etc. are all characterized by elevated expression of various genes present in the respective virus. Detection of elevated expression levels of characteristic viral genes provides an effective diagnostic of the disease state. In particular, viruses such as herpes simplex, enter quiescent states for periods of time only to erupt in brief periods of rapid replication. Detection of expression levels of characteristic viral genes allows detection of such active proliferative (and presumably infective) states.

The use of "traditional" hybridization protocols for monitoring or quantifying gene expression is problematic. For example, two or more gene products of approximately the same molecular weight will prove difficult or impossible to distinguish in a Northern blot because they are not readily separated by electrophoretic methods. Similarly, as hybridization efficiency and cross-reactivity varies with the particular subsequence (region) of a gene being probed it is difficult to obtain an accurate and reliable measure of gene expression with one, or even a few, probes to the target gene.

The development of VLSIPS™ technology provided methods for synthesizing arrays of many different oligonucleotide probes that occupy a very small surface area. See U.S. Patent No. 5,143,854 and PCT patent publication No. WO 90/15070. U.S. Patent application Serial No. 082,937, filed June 25, 1993, describes methods for making arrays of oligonucleotide probes that can be used to provide the complete sequence of a target nucleic acid and to detect the presence of a nucleic acid containing a specific nucleotide sequence.

Previous methods of measuring nucleic acid abundance differences or changes in the expression of various genes (e.g., differential display, SAGE, cDNA sequencing, clone spotting, etc.) require assumptions about, or prior knowledge regarding the target sequences in order to design appropriate sequence-specific probes. Other methods, such as subtractive hybridization, do not require prior sequence knowledge, but also do not directly provide sequence information regarding differentially expressed nucleic acids.



SUMMARY OF THE INVENTION

The present invention, in one embodiment, provides methods of monitoring the expression of a multiplicity of preselected genes (referred to herein as "expression monitoring"). In another embodiment this invention provides a way of identifying differences in the compositions of two or more nucleic acid (e.g., RNA or DNA) samples. Where the nucleic acid abundances reflect expression levels in biological samples from which the samples are derived, the invention provides a method for identifying differences in expression profiles between two or more samples. These "generic difference screening methods" are rapid, simple to apply, require no a priori assumptions about the particular sequences whose expression may differ between the two samples, and provide direct sequence information regarding the nucleic acids whose abundances differ between the samples.

In one embodiment, this invention provides a method of identifying differences in nucleic acid levels between two or more nucleic acid samples. The method involves the steps of: (a) providing one or more oligonucleotide arrays said arrays comprising probe oligonucleotides attached to a surface; (b) hybridizing said nucleic acid samples to said one or more arrays to form hybrid duplexes between nucleic acids in said nucleic acid samples and probe oligonucleotides in said one or more arrays that are complementary to said nucleic acids or subsequences thereof; (c) contacting said one or more arrays with a nucleic acid ligase; and (d) determining differences in hybridization between said nucleic acid samples wherein said differences in hybridization indicate differences in said nucleic acid levels.

In another embodiment, the method of identifying differences in nucleic acid levels between two or more nucleic acid samples involves the steps of: (a) providing one or more oligonucleotide arrays comprising probe oligonucleotides wherein said probe oligonucleotides comprise a constant region and a variable region; (b) hybridizing said nucleic acid samples to said one or more arrays to form hybrid duplexes between nucleic acids in said nucleic acid samples and said variable regions that are complementary to said nucleic acids or subsequences thereof; and (c) determining differences in hybridization between said nucleic acid samples wherein said differences in hybridization indicate differences in said nucleic acid levels.

In yet another embodiment, the method of identifying differences in nucleic acid levels between two or more nucleic acid samples involves the steps of: (a) providing one or more high density oligonucleotide arrays; (b) hybridizing said nucleic acid samples to said one or more arrays to form hybrid duplexes between nucleic acids in said nucleic acid samples and probe oligonucleotides in said one or more arrays that are complementary to said nucleic acids or subsequences thereof; and (c) determining the differences in hybridization between said nucleic acid samples wherein said differences in hybridization indicate differences in said nucleic acid levels.

In still yet another embodiment, the method of identifying differences in nucleic acid levels between two or more nucleic acid samples involves the steps of: (a) providing one or more oligonucleotide arrays each comprising probe oligonucleotides wherein said probe oligonucleotides are not chosen to hybridize to nucleic acids derived from particular preselected genes or mRNAs; (b) hybridizing said nucleic acid samples to said one or more arrays to form hybrid duplexes between nucleic acids in said nucleic acid samples and probe oligonucleotides in said one or more arrays that are complementary to said nucleic acids or subsequences thereof; and (c) determining differences in hybridization between said nucleic acid samples wherein said differences in hybridization indicate differences in said nucleic acid levels.

In another embodiment, the method of identifying differences in nucleic acid levels between two or more nucleic acid samples involves the steps of: (a) providing one or more oligonucleotide arrays each comprising probe oligonucleotides wherein said probe oligonucleotides comprise a nucleotide sequence or subsequences selected according to a process selected from the group consisting of a random selection, a haphazard selection, a nucleotide composition biased selection, and all possible oligonucleotides of a preselected length; (b) hybridizing said nucleic acid samples to said one or more arrays to form hybrid duplexes between nucleic acids in said nucleic acid samples and probe oligonucleotides in said one or more arrays that are complementary to said nucleic acids or subsequences thereof; and (c) determining differences in hybridization between said nucleic acid samples wherein said differences in hybridization indicate differences in said nucleic acid levels.

In another embodiment, the methods of identifying differences in nucleic acid levels between two or more nucleic acid samples involve the steps of: (a) providing one or more oligonucleotide arrays each comprising probe oligonucleotides wherein said probe oligonucleotides comprise a nucleotide sequence or subsequences selected according to a process selected from the group consisting of a random selection, a haphazard selection, a nucleotide composition biased selection, and all possible oligonucleotides of a preselected length; (b) providing software describing the location and sequence of probe oligonucleotides on said array; (c) hybridizing said nucleic acid samples to said one or more arrays to form hybrid duplexes between nucleic acids in said nucleic acid samples and probe oligonucleotides in said one or more arrays that are complementary to said nucleic acids or subsequences thereof; and (d) operating said software such that said hybridizing indicates differences in said nucleic acid levels.

This invention also provides methods of simultaneously monitoring the expression of a multiplicity of genes. In one embodiment, these methods involve (a) providing a pool of target nucleic acids comprising RNA transcripts of one or more of said genes, or nucleic acids derived from said RNA transcripts; (b) hybridizing said pool of nucleic acids to an oligonucleotide array comprising probe oligonucleotides immobilized on a surface; (c) contacting said oligonucleotide array with a ligase; and (d) quantifying the hybridization of said nucleic acids to said array wherein said quantifying provides a measure of the levels of transcription of said genes.

Still yet another method of identifying differences in nucleic acid levels between two or more nucleic acid samples involves the steps of: (a) providing one or more arrays of oligonucleotides each array comprising pairs of probe oligonucleotides where the members of each pair of probe oligonucleotides differ from each other in preselected nucleotides; (b) hybridizing said nucleic acid samples to said one or more arrays to form hybrid duplexes between nucleic acids in said nucleic acid samples and probe oligonucleotides in said one or more arrays that are complementary to said nucleic acids or subsequences thereof; (c) determining the differences in hybridization between said nucleic acid samples wherein said differences in hybridization indicate differences in said nucleic acid levels.
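The final step shared by the difference screening methods above, determining differences in hybridization between samples, can be illustrated with a minimal Python sketch. It assumes each sample has already been reduced to one hybridization signal per probe sequence; the function name, the dictionary layout, and the fixed `threshold` are hypothetical choices for illustration and are not taken from this specification, which does not prescribe a particular comparison rule at this point.

```python
from typing import Dict, List, Tuple


def hybridization_differences(
    sample_a: Dict[str, float],
    sample_b: Dict[str, float],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Return (probe, difference) pairs where |signal_b - signal_a| > threshold.

    Keys are probe sequences; values are hybridization signals in arbitrary
    units. Only probes present in both samples are compared.
    """
    flagged = []
    for probe in sorted(set(sample_a) & set(sample_b)):
        diff = sample_b[probe] - sample_a[probe]
        if abs(diff) > threshold:
            flagged.append((probe, diff))
    return flagged


if __name__ == "__main__":
    # Made-up signals for three probes on the "same array" read with two samples.
    a = {"ACGTACGT": 120.0, "TTGACCAG": 80.0, "GGCATCAA": 15.0}
    b = {"ACGTACGT": 118.0, "TTGACCAG": 240.0, "GGCATCAA": 2.0}
    for probe, diff in hybridization_differences(a, b, threshold=10.0):
        print(probe, diff)
```

Because the flagged keys are probe sequences, the output also carries the sequence information about the nucleic acids whose levels differ. A real analysis would normally normalize the two arrays before comparing them; that step is omitted here.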

Another method of simultaneously monitoring the expression of a multiplicity of genes involves the steps of: (a) providing one or more oligonucleotide arrays comprising probe oligonucleotides wherein said probe oligonucleotides comprise a constant region and a variable region; (b) providing a pool of target nucleic acids comprising RNA transcripts of one or more of said genes, or nucleic acids derived from said RNA transcripts; (c) hybridizing said pool of nucleic acids to an array of oligonucleotide probes immobilized on a surface; and (d) quantifying the hybridization of said nucleic acids to said array wherein said quantifying provides a measure of the levels of transcription of said genes.

This invention additionally provides methods of making a nucleic acid array for identifying differences in nucleic acid levels between two or more nucleic acid samples. In one embodiment the method involves the steps of: (a) providing an oligonucleotide array comprising probe oligonucleotides wherein said probe oligonucleotides comprise a constant region and a variable region; (b) hybridizing one or more of said nucleic acid samples to said arrays to form hybrid duplexes of said variable region and nucleic acids in said nucleic acid samples comprising subsequences complementary to said variable region; (c) attaching the sample nucleic acids comprising said hybrid duplexes to said array of probe oligonucleotides; and (d) removing unattached nucleic acids to provide a high density oligonucleotide array bearing sample nucleic acids attached to said array.

In another embodiment the method of making a nucleic acid array for identifying differences in nucleic acid levels between two or more nucleic acid samples involves the steps of: (a) providing a high density array; (b) contacting said array with one or more of said two or more nucleic acid samples whereby nucleic acids of said one of said two or more nucleic acid samples form hybrid duplexes with probe oligonucleotides in said arrays; (c) attaching the sample nucleic acids comprising said hybrid duplexes to said array of probe oligonucleotides; and (d) removing unattached nucleic acids to provide a high density oligonucleotide array bearing sample nucleic acids attached to said array.



This invention additionally provides kits for practice of the methods described herein. One kit comprises a container containing one or more oligonucleotide arrays said arrays comprising probe oligonucleotides attached to a surface; and a container containing a ligase. Another kit comprises a container containing one or more oligonucleotide arrays said arrays comprising probe oligonucleotides wherein said probe oligonucleotides comprise a constant region and a variable region. This kit optionally includes a constant oligonucleotide complementary to said constant region or a subsequence thereof.

Preferred high density oligonucleotide arrays of this invention comprise more than 100 different probe oligonucleotides wherein: each different probe oligonucleotide is localized in a predetermined region of the array; each different probe oligonucleotide is attached to a surface through a terminal covalent bond; and the density of said different probe oligonucleotides is greater than about 60 different oligonucleotides per 1 cm². The high density arrays can be used in all of the array-based methods discussed herein. High density arrays used for expression monitoring will typically include oligonucleotide probes selected to be complementary to a nucleic acid derived from one or more preselected genes. In contrast, generic difference screening arrays may contain probe oligonucleotides selected randomly, haphazardly, arbitrarily, or including sequences or subsequences comprising all possible nucleic acid sequences of a particular (preselected) length.
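To give a sense of scale for a generic array containing all possible nucleic acid sequences of a preselected length, the following Python sketch enumerates every DNA sequence of length k and counts them (4^k). The function names are hypothetical and chosen only for this illustration.

```python
from itertools import product
from typing import Iterator

DNA_BASES = "ACGT"


def all_kmers(k: int) -> Iterator[str]:
    """Yield every DNA sequence of length k in lexicographic order."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    for bases in product(DNA_BASES, repeat=k):
        yield "".join(bases)


def kmer_count(k: int) -> int:
    """Number of distinct DNA sequences of length k, i.e. 4**k."""
    return len(DNA_BASES) ** k


if __name__ == "__main__":
    print(list(all_kmers(2))[:5])
    for k in range(6, 13):
        print(k, kmer_count(k))
```

The count grows by a factor of four with each added position, from 4,096 sequences at length 6 to 16,777,216 at length 12.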



In a preferred embodiment, pools of oligonucleotides or oligonucleotide subsequences comprising all possible nucleic acids of a particular length are selected from the group consisting of all possible 6 mers, all possible 7 mers, all possible 8 mers, all possible 9 mers, all possible 10 mers, all possible 11 mers, and all possible 12 mers.

This invention also provides methods of labeling a nucleic acid. In one embodiment, this method involves the steps of: (a) providing a nucleic acid; (b) amplifying said nucleic acid to form amplicons; (c) fragmenting said amplicons to form fragments of said amplicons; and (d) coupling a labeled moiety to at least one of said fragments.

In another embodiment, the methods involve the steps of: (a) providing a nucleic acid; (b) transcribing said nucleic acid to form a transcribed nucleic acid; (c) fragmenting said transcribed nucleic acid to form fragments of said transcribed nucleic acid; and (d) coupling a labeled moiety to at least one of said fragments.

In yet another embodiment, the methods involve the steps of: (a) providing at least one nucleic acid coupled to a support; (b) providing a labeled moiety capable of being coupled with a terminal transferase to said nucleic acid; (c) providing said terminal transferase; and (d) coupling said labeled moiety to said nucleic acid using said terminal transferase.

In still another embodiment, the methods involve the steps of: (a) providing at least two nucleic acids coupled to a support; (b) increasing the number of monomer units of said nucleic acids to form a common nucleic acid tail on said at least two nucleic acids; (c) providing a labeled moiety capable of recognizing said common nucleic acid tails; and (d) contacting said common nucleic acid tails and said labeled moiety.

In still yet another embodiment, the methods involve the steps of: (a) providing at least one nucleic acid coupled to a support; (b) providing a labeled moiety capable of being coupled with a ligase to said nucleic acid; (c) providing said ligase; and (d) coupling said labeled moiety to said nucleic acid using said ligase.

This invention also provides compounds of the formulas described herein.



An array of oligonucleotides as used herein refers to a multiplicity of different (sequence) oligonucleotides attached (preferably through a single terminal covalent bond) to one or more solid supports where, when there is a multiplicity of supports, each support bears a multiplicity of oligonucleotides. The term "array" can refer to the entire collection of oligonucleotides on the support(s) or to a subset thereof. The term "same array" when used to refer to two or more arrays is used to mean arrays that have substantially the same oligonucleotide species thereon in substantially the same abundances. The spatial distribution of the oligonucleotide species may differ between the two arrays, but, in a preferred embodiment, it is substantially the same. It is recognized that even where two arrays are designed and synthesized to be identical there are variations in the abundance, composition, and distribution of oligonucleotide probes. These variations are preferably insubstantial and/or compensated for by the use of controls as described herein.

The phrase "massively parallel screening" refers to the simultaneous screening of at least about 100, preferably about 1000, more preferably about 10,000 and most preferably about 1,000,000 different nucleic acid hybridizations.

The terms "nucleic acid" or "nucleic acid molecule" refer to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, would encompass known analogs of natural nucleotides that can function in a similar manner as naturally occurring nucleotides.

An oligonucleotide is a single-stranded nucleic acid ranging in length from 2 to about 1000 nucleotides, more typically from 2 to about 500 nucleotides in length.

As used herein a "probe" is defined as an oligonucleotide capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. As used herein, an oligonucleotide probe may include natural (i.e. A, G, C, or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in an oligonucleotide probe may be joined by a linkage other than a phosphodiester bond, so long as it does not interfere with hybridization. Thus, oligonucleotide probes may be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than phosphodiester linkages.

The term "target nucleic acid" refers to a nucleic acid (often derived from a biological sample and hence referred to also as a sample nucleic acid), to which the oligonucleotide probe specifically hybridizes. It is recognized that the target nucleic acids can be derived from essentially any source of nucleic acids (e.g., including, but not limited to chemical syntheses, amplification reactions, forensic samples, etc.). It is either the presence or absence of one or more target nucleic acids that is to be detected, or the amount of one or more target nucleic acids that is to be quantified. The target nucleic acid(s) that are detected preferentially have nucleotide sequences that are complementary to the nucleic acid sequences of the corresponding probe(s) to which they specifically bind (hybridize). The term target nucleic acid may refer to the specific subsequence of a larger nucleic acid to which the probe specifically hybridizes, or to the overall sequence (e.g., gene or mRNA) whose abundance (concentration) and/or expression level it is desired to detect. The difference in usage will be apparent from context.

A "ligatable oligonucleotide" or "ligatable probe" or "ligatable oligonucleotide probe" refers to an oligonucleotide that is capable of being ligated to another oligonucleotide by the use of a ligase (e.g., T4 DNA ligase). The ligatable oligonucleotide is preferably a deoxyribonucleotide. The nucleotides comprising the ligatable oligonucleotide are preferably the "standard" nucleotides; A, G, C, and T or U. However derivatized, modified, or alternative nucleotides (e.g., inosine) can be present as long as their presence does not interfere with the ligation reaction. The ligatable probe may be labeled or otherwise modified as long as the label does not interfere with the ligation reaction. Similarly the internucleotide linkages can be modified as long as the modification does not interfere with ligation. Thus, in some instances, the ligatable oligonucleotide can be a peptide nucleic acid.

"Subsequence" refers to a sequence of nucleic acids that comprises a part of a longer sequence of nucleic acids.

A "wobble" refers to a degeneracy at a particular position in an oligonucleotide. A fully degenerate or "4 way" wobble refers to a collection of nucleic acids (e.g., oligonucleotide probes having A, G, C, or T for DNA or A, G, C, or U for RNA at the wobble position.) A wobble may be approximated by the replacement of the nucleotide with inosine which will base pair with A, G, C, or T or U. Typically oligonucleotides containing a fully degenerate wobble produced during chemical synthesis of an oligonucleotide are prepared by using a mixture of four different nucleotide monomers at the particular coupling step in which the wobble is to be introduced.

The term "cross-linking" when used in reference to cross-linking nucleic acids refers to attaching nucleic acids such that they are not separated under typical conditions that are used to denature complementary nucleic acid sequences. Crosslinking preferably involves the formation of covalent linkages between the nucleic acids. Methods of cross-linking nucleic acids are described herein.

The phrase "coupled to a support" means bound directly or indirectly thereto including attachment by covalent binding, hydrogen bonding, ionic interaction, hydrophobic interaction, or otherwise.

"Amplicons" are the products of the amplification of nucleic acids by PCR or otherwise.
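The fully degenerate ("4 way") wobble defined above can be made concrete with a short Python sketch that expands a probe containing wobble positions into the collection of sequences it stands for. Marking a wobble position with the letter "N" follows the common IUPAC convention and is an assumption of this example, as is the function name; neither is taken from this specification.

```python
from itertools import product
from typing import List


def expand_wobble(probe: str, rna: bool = False, wobble: str = "N") -> List[str]:
    """Expand each wobble position into all four bases.

    Every occurrence of the wobble marker is replaced by A, C, G, and T
    (or U when rna=True), giving 4**n sequences for n wobble positions.
    """
    bases = "ACGU" if rna else "ACGT"
    # A fixed position contributes one choice, a wobble position four.
    choices = [bases if ch == wobble else ch for ch in probe]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    print(expand_wobble("ACNT"))
    print(len(expand_wobble("ANNT")))
```

A single wobble position yields four sequences and two positions yield sixteen, mirroring the mixture of four nucleotide monomers used at each wobble coupling step.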

"Transcribing a nucleic acid" means the formation of a ribonucleic acid from a deoxyribonucleic acid and the converse (the formation of a deoxyribonucleic acid from a ribonucleic acid). A nucleic acid can be transcribed by DNA-dependent RNA polymerase, reverse transcriptase, or otherwise.

A labeled moiety means a moiety capable of being detected by the various methods discussed herein or known in the art.

The term "complexity" is used here according to standard meaning of this term as established by Britten et al. Methods of Enzymol. 29:363 (1974). See, also Cantor and Schimmel Biophysical Chemistry: Part III at 1228- for further explanation of nucleic acid complexity.

"Bind(s) substantially" refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target polynucleotide sequence.

The phrase "hybridizing specifically to" refers to the binding, duplexing, or hybridizing of a molecule preferentially to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) DNA or RNA. The term "stringent conditions" refers to conditions under which a probe will hybridize preferentially to its target subsequence, and to a lesser extent to, or not at all to, other sequences. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. Generally, stringent conditions are selected to be lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target sequence hybridize to the target sequence at equilibrium. (As the target sequences are generally present in excess, at Tm, 50% of the probes are occupied at equilibrium). Typically, stringent conditions will be those in which the salt concentration is at least about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., 10 to 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide.
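The typical ranges stated above for short probes can be written as a simple check. The Python sketch below encodes only the numbers given in the preceding definition (0.01 to 1.0 M Na ion, pH 7.0 to 8.3, at least about 30°C, probes of 10 to 50 nucleotides); the class and function names are hypothetical, and the check does not model the melting temperature or destabilizing agents such as formamide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridizationConditions:
    na_molar: float      # Na ion concentration in mol/L
    ph: float
    temp_c: float        # temperature in degrees Celsius
    probe_length: int    # probe length in nucleotides


def is_typically_stringent(c: HybridizationConditions) -> bool:
    """Check conditions against the typical ranges quoted for short probes."""
    if not 10 <= c.probe_length <= 50:
        raise ValueError("typical ranges are stated only for probes of 10 to 50 nt")
    salt_ok = 0.01 <= c.na_molar <= 1.0
    ph_ok = 7.0 <= c.ph <= 8.3
    temp_ok = c.temp_c >= 30.0
    return salt_ok and ph_ok and temp_ok


if __name__ == "__main__":
    print(is_typically_stringent(HybridizationConditions(0.1, 7.5, 37.0, 25)))
    print(is_typically_stringent(HybridizationConditions(0.1, 7.5, 25.0, 25)))
```

Since stringent conditions are sequence-dependent, a check like this is at most a coarse screen; the appropriate temperature for a given probe depends on its Tm.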

The tenn "perfect match probe" refeis to a 
perfectly compiementaiy to a particular target sequence. 
5 perfectly coii4)lemeiitaiy to a portion (subsequence) of the tfaiget 
match (PM) probe can be a **test probe**, a "normalization o^ntroi' 
level control probe and the like. A perfect match control or 
however, distinguished from a **mismatch controP or '*mii 
expression monitoring arrays, perfect match probes are typi^ly 
10 be con^lememary to paiticuhff sequences or subsequences 
particular genes). In contrast, in generic difference screening 
sequences are typically unknown. In the latter case, 
preselected. The tenn perfect match probe in this context is 

a r ^ fi^ «p r M M«f»g "mi w n i i trJi emdml" tfiatdiflfera from the 

15 particular preselected nucleotides as described below. 

The term "mismatch control" or ''mismatch 
monitoring arrays, refers to probes whose sequence is 
perfectly complementary to a particular target sequence, 
in a high-density may there preferably exists a coi 

20 that is perfectly complementary to the same particular targe 
random, arbitrary, haphazard, etc.) arrays, since the target 
perfect match and mismatch probes cannot be a 
In this instance, the probes are preferably provided as pairs 
in one or more preselected nucleotides. Thus, while it is no : 

25 probes in the pair is the perfect match, it is known that whe: i 
hybridizes to a particular target sequence, the other probe o 
control for that target sequence. It will be appreciated that 
probes need not be provided as pairs, but may be provided 
5, or more) of probes that differ from each other in 

30 While the xnxsroatch(s) may be located anywhere in the 

mig mntchag are less desirable as a terminal mismatch is \esk likely 



wi h the addition of destabilizing 



that has a sequence that is 
test probe is typically 
sequence. The perfect 
probe, an expression 
] «rfect match probe is, 

ptobe.** In the case of 
preselected (designed) to 
f target nucleic acids («.;.. 
arrays, the particular target 
probes cannot be 
to distinguish that probe from 
;t match in one or more 



deiibt rately 



Fo reach I 



orrcspondi] ig perfect 



n icleic i 



priori deta nined, designed, c 



l^be**, in expression 

selected not to be 
mismatch (MM) control 
match (PM) probe 
sequence. In "generic" (e.g.. 
acid(s) are unknown 
or selected, 
where each pair of probes differ 
known a priori which of the 
one probe specifically 
the pair will act as a mismatch 
1 he perfect match and mismatch 
larger collections {e.g., 3. 4, 
nucleotides, 
probe, terminal 
to prevent hybridization 



particuL a- preselected i 



of the target sequence. In a particularly preferred embodiment, the mismatch is located at or near the center of the probe such that the mismatch is most likely to destabilize the duplex with the target sequence under the test hybridization conditions. In a particularly preferred embodiment, perfect matches differ from mismatch controls in a single centrally-located nucleotide.

The terms "background" or "background signal intensity" refer to hybridization signals resulting from non-specific binding, or other interactions, between the labeled target nucleic acids and components of the oligonucleotide array (e.g., the oligonucleotide probes, control probes, the array substrate, etc.). Background signals may also be produced by intrinsic fluorescence of the array components themselves. A single background signal can be calculated for the entire array, or a different background signal may be calculated for each region of the array. In a preferred embodiment, background is calculated as the average hybridization signal intensity for the lowest 1% to 10% of the probes in the array, or region of the array. In expression monitoring arrays (i.e., where probes are preselected to hybridize to specific nucleic acids (genes)), a different background signal may be calculated for each target nucleic acid. Where a different background signal is calculated for each target gene, the background signal is calculated for the lowest 1% to 10% of the probes for each gene. Of course, one of skill in the art will appreciate that where the probes to a particular gene hybridize well and thus appear to be specifically binding to a target sequence, they should not be used in a background signal calculation. Alternatively, background may be calculated as the average hybridization signal intensity produced by hybridization to probes that are not complementary to any sequence found in the sample (e.g. probes directed to nucleic acids of the opposite sense or to genes not found in the sample such as bacterial genes when the sample is of mammalian origin). Background can also be calculated as the average signal intensity produced by regions of the array that lack any probes at all.

The term "quantifying" when used in the context of quantifying nucleic acid abundances or concentrations (e.g., transcription levels of a gene) can refer to absolute or to relative quantification. Absolute quantification may be accomplished by inclusion of known concentration(s) of one or more target nucleic acids (e.g. control nucleic acids such as BioB or with known amounts the target nucleic acids themselves) and referencing the



hybridization intensity of unknowns with the known target nucleic acids (e.g. through generation of a standard curve). Alternatively, relative quantification can be accomplished by comparison of hybridization signals between two or more genes, or between two or more treatments to quantify the changes in hybridization intensity and, by implication, transcription level.

The "percentage of sequence identity" or "sequence identity" is determined by comparing two optimally aligned sequences or subsequences over a comparison window or span, wherein the portion of the polynucleotide sequence in the comparison window may optionally comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical subunit (e.g. nucleic acid base or amino acid residue) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity. Percentage sequence identity when calculated using the programs GAP or BESTFIT (see below) is calculated using default gap weights.

Methods of alignment of sequences for comparison are well known in the art. Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2: 482 (1981), by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48: 443 (1970), by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85: 2444 (1988), by computerized implementations of these algorithms (including, but not limited to CLUSTAL in the PC/Gene program by Intelligenetics, Mountain View, California, GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group (GCG), 575 Science Dr., Madison, Wisconsin, USA), or by inspection. In particular, methods for aligning sequences using the CLUSTAL program are well described by Higgins and Sharp in Gene, 73: 237-244 (1988) and in CABIOS 5: 151-153 (1989)).
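By way of illustration only, the percentage of sequence identity defined above can be sketched in a few lines of Python. The sketch assumes the two sequences have already been optimally aligned by one of the cited programs and are supplied as equal-length strings in which "-" marks a gap; the function name percent_identity and the gap symbol are hypothetical choices made for this example and are not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of sequence identity over a comparison window.

    Assumes both strings are already aligned, have equal length,
    and use '-' for gap positions (an illustrative convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # Count positions where the identical subunit occurs in both sequences;
    # a gap never counts as a match.
    matched = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    # Divide by the total number of positions in the window, multiply by 100.
    return 100.0 * matched / len(aligned_a)
```

For instance, a nine-position window with seven matched positions gives roughly 77.8 percent identity. Gap weighting as applied by GAP or BESTFIT is outside the scope of this sketch.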



BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 shows a schematic of expression monitoring using oligonucleotide arrays. Extracted poly (A)+ RNA is converted to cDNA, which is then transcribed in the presence of labeled ribonucleotide triphosphates. L is either biotin or a dye such as fluorescein. RNA is fragmented with heat in the presence of magnesium ions. Hybridizations are carried out in a flow cell that contains the two-dimensional DNA probe arrays. Following a brief washing step to remove unhybridized RNA, the arrays are scanned using a scanning confocal microscope. Alternatives in which cellular mRNA is directly labeled without a cDNA intermediate are described in the Examples. Image analysis software converts the scanned array images into text files in which the observed intensities at specific physical locations are associated with particular probe sequences.

Fig. 2A shows a fluorescent image of a high density array containing over 16,000 different oligonucleotide probes. The image was obtained following hybridization (15 hours at 40°C) of biotin-labeled randomly fragmented RNA transcribed from the murine B cell (T10) cDNA library, and spiked at the level of 1:3,000 (50 pM equivalent to about 100 copies per cell) with 13 specific RNA targets. The brightness at any location is indicative of the amount of labeled RNA hybridized to the particular oligonucleotide probe. Fig. 2B shows a small portion of the array (the boxed region of Fig. 2A) containing probes for IL-2 and IL-3 RNAs. For comparison, Fig. 2C shows the same region of the array following hybridization with an unspiked T10 RNA sample (T10 cells do not express IL-2 and IL-3). The variation in the signal intensity was highly reproducible and reflected the sequence dependence of the hybridization efficiencies. The central cross and the four corners of the array contain a control sequence that is complementary to a biotin-labeled oligonucleotide that was added to the hybridization solution at a constant concentration (50 pM). The sharpness of the images near the boundaries of the features was limited by the resolution of the reading device (11.25 μm) and not by the spatial resolution of the array synthesis. The pixels in the border regions of each synthesis feature were systematically ignored in the quantitative analysis of the images.

Fig. 3 provides a log/log plot of the hybridization intensity (average of the PM-MM intensity differences for each gene) versus concentration for 11 different RNA targets. The hybridization signals were quantitatively related to target concentration. The



experiments were performed as described in the Examples herein and in Fig. 2. The ten cytokine RNAs (plus bioB) were spiked into labeled T10 RNA at levels ranging from 1:300,000 to 1:3,000. The signals continued to increase with increased concentration up to frequencies of 1:300, but the response became sublinear at the high levels due to saturation of the probe sites. The linear range can be extended to higher concentrations by using shorter hybridization times. RNAs from genes expressed in T10 cells (IL-10, β-actin and GAPDH) were also detected at levels consistent with results obtained by probing cDNA libraries.

Fig. 4 shows cytokine mRNA levels in the 2D6 T helper cell line at different times following stimulation with PMA and a calcium ionophore. Poly (A)+ RNA was extracted at 0, 2, 6, and 24 hours following stimulation and converted to double stranded cDNA containing an RNA polymerase promoter. The cDNA pool was then transcribed in the presence of biotin labeled ribonucleotide triphosphates, fragmented, and hybridized to the oligonucleotide probe arrays for 2 and 22 hours. The fluorescence intensities were converted to RNA frequencies by comparison with the signals obtained for a bacterial RNA (biotin synthetase) spiked into the samples at known amounts prior to hybridization. A signal of 50,000 corresponds to a frequency of approximately 1:100, a signal of 1,000 to a frequency of 1:5,000, and a signal of 100 to a frequency of 1:50,000. RNAs for IL-2, IL-4, IL-6, and IL-12p40 were not detected above the level of approximately 1:200,000 in these experiments. The error bars reflect the estimated uncertainty (25 percent) in the level for a given RNA relative to the level for the same RNA at a different time point. The relative uncertainty estimate was based on the results of repeated spiking experiments, and on repeated measurements of IL-10, β-actin and GAPDH RNAs in preparations from both T10 and 2D6 cells (unstimulated). The uncertainty in the absolute frequencies includes message-to-message differences in the hybridization efficiency as well as differences in the mRNA isolation, cDNA synthesis, and RNA synthesis and labeling steps. The uncertainty in the absolute frequencies is estimated to be a factor of three.

Fig. 5 shows a fluorescence image of an array containing over 63,000 different oligonucleotide probes for 118 genes. The image was obtained following overnight hybridization of a labeled murine B cell RNA sample. Each square synthesis region is 50 x 50 μm and contains 10^7 to 10^8 copies of a specific oligonucleotide. The



array was scanned at a resolution of 7.5 μm in approximately 15 minutes. The bright rows indicate RNAs present at high levels. Lower level RNAs were unambiguously detected based on quantitative evaluation of the hybridization patterns. A total of 21 murine RNAs were detected at levels ranging from approximately 1:300,000 to 1:100. The cross in the center, the checkerboard in the corners, and the MUR-1 region at the top contain probes complementary to a labeled control oligonucleotide that was added to all samples.

Fig. 6 shows an example of a computer system used to execute the software of an embodiment of the present invention.

Fig. 7 shows a system block diagram of a typical computer system used to execute the software of an embodiment of the present invention.

Fig. 8 shows the high level flow of a process of monitoring the expression of a gene by comparing hybridization intensities of pairs of perfect match and mismatch probes.

Fig. 9 shows the flow of a process of determining if a gene is expressed utilizing a decision matrix.

Figs. 10A and 10B show the flow of a process of determining the expression of a gene by comparing baseline scan data and experimental scan data.

Fig. 11 shows the flow of a process of increasing the number of probes for monitoring the expression of genes after the number of probes has been reduced or pruned.

Figs. 12a and 12b illustrate the probe oligonucleotide/ligation reaction system. Fig. 12a generally illustrates the various components of the probe oligonucleotide/ligation reaction system. Fig. 12b illustrates discrimination of non-perfectly complementary target:oligonucleotide hybrids using the probe oligonucleotide/ligation reaction system.

Figs. 13a, 13b, 13c, and 13d illustrate the various components of ligation/hybridization reactions and illustrate various ligation strategies. Fig. 13a illustrates various components of the ligation/hybridization reaction, some of which are optional in various embodiments. Fig. 13b illustrates a ligation strategy that discriminates mismatches at the terminus of the probe oligonucleotide. Fig. 13c illustrates a ligation strategy that discriminates mismatches at the terminus of the
sample oligonucleotide. Fig. 13d illustrates a method for improving the discrimination at both the probe terminus and the sample terminus.

Figs. 14a, 14b, 14c and 14d illustrate a ligation discrimination used in conjunction with a restriction digest of the sample nucleic acid. Fig. 14a shows the recognition site and cleavage pattern of SacI (a 6 cutter) and Hsp92 II (4 cutter). Fig. 14b illustrates the effect of SacI cleavage on a (target) nucleic acid sample. Fig. 14c illustrates a 6 Mb genome (i.e., E. coli) digested with SacI and SphI generating ~1 kb genomic fragments with a 5' C. Fig. 14d illustrates the hybridization/ligation of these fragments to a generic difference screening chip and their subsequent use as probes to hybridize to the appropriate nucleic acid (Format I) or the fragments are labeled, hybridized/ligated to the oligonucleotide array and directly analyzed (Format II).

Figs. 15a, 15b, 15c, 15d, and 15e illustrate the analysis of differential display DNA fragments on a generic difference screening array. Fig. 15a shows first strand cDNA synthesis by reverse transcription of poly(A) mRNA using an anchored poly(T) primer. Fig. 15b illustrates upstream primers for PCR reaction containing an engineered restriction site and degenerate bases (N=A,G,C,T) at the 3' end. Fig. 15c shows randomly primed PCR of first strand cDNA. Fig. 15d shows restriction digest of PCR products, and Fig. 15e shows sorting of PCR products on a generic ligation array by their 5' end.

Figs. 16a, 16b, and 16c illustrate the differences between replicate 1 and replicate 2 for sample 1 and sample 2 nucleic acids. Fig. 16a shows the differences between replicate 1 and replicate 2 for sample 1, the normal cell line. Fig. 16b shows the differences between replicate 1 and replicate 2 for sample 2, the tumor cell line. Figure 16c plots the differences between sample 1 and 2 averaged over the two replicates.

Figs. 17a, 17b, and 17c illustrate the data of Figs. 16a, 16b, and 16c filtered. Figure 17a shows the relative change in hybridization intensities of replicate 1 and 2 of sample 1 for the difference of each oligonucleotide pair. Fig. 17b shows the ratio of replicate 1 and 2 of sample 2 for the difference of each oligonucleotide pair, normalized, filtered, and plotted the same way as in Figure 17a. Fig. 17c shows the ratio of sample 1 and sample 2 averaged over two replicates for the difference of each oligonucleotide pair. The ratio is calculated as in Fig. 17a, but based on the absolute
value of [(X_{11}+X_{12})/2]/[(X_{21}+X_{22})/2] and [(X_{21}+X_{22})/2]/[(X_{11}+X_{12})/2] after normalization as in Fig. 16c.

Fig. 18 illustrates post-fragmentation labeling using a CIAP treatment.

Fig. 19 provides a schematic illustration of post-hybridization end labeling on a high density oligonucleotide array.

Fig. 20 provides a schematic illustration of end-labeling utilizing pre-reaction of a high density array prior to hybridization and end labeling.

Fig. 21 illustrates the results of a measure of post-hybridization TdTase end labeling call accuracy.

Fig. 22 illustrates oligo dT labeling on a high density oligonucleotide array.

Fig. 23 illustrates various labeling reagents suitable for use in the methods disclosed herein. Fig. 23a shows various labeling reagents. Fig. 23b shows still other labeling reagents. Fig. 23c shows non-ribose or ribose-containing labels. Fig. 23d shows sugar-modified nucleotide analogue labels.

Fig. 24 illustrates resequencing of a target DNA molecule with a set of generic n-mer tiling probes.

Fig. 25 illustrates four tiling arrays present on a 4-mer generic array.

Fig. 26 illustrates base calling at the 8th position in the target.

Fig. 27 illustrates a base vote table.

Fig. 28 illustrates the effect of applying correctness score transform to HIV data.

Fig. 29 illustrates mutation detection by intensity comparisons.

Fig. 30 illustrates bubble formation detection of mutation in the HIV genome.

Fig. 31 illustrates induced difference nearest neighbor probe scoring.

Fig. 32 illustrates mutations found in an HIV PCR target (B) using a generic ligation GeneChip and induced difference analysis.

Fig. 33 illustrates mutation detection using a reference target and a sample target.
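By way of illustration only, the quantities referred to in the descriptions of Figs. 3 and 4 and in the definition of "background" can be sketched as follows. The sketch assumes probe intensities are available as plain lists of numbers and that signal is proportional to RNA frequency within the linear range; the function names, the default 5% fraction, and the single-point calibration are hypothetical choices for this example and are not taken from the disclosure.

```python
from statistics import mean


def background_signal(intensities, fraction=0.05):
    """Average intensity of the lowest `fraction` of probes (1% to 10% in the text)."""
    if not intensities:
        raise ValueError("no probe intensities supplied")
    if not 0.01 <= fraction <= 0.10:
        raise ValueError("fraction should lie between 0.01 and 0.10")
    ordered = sorted(intensities)
    count = max(1, round(len(ordered) * fraction))
    return mean(ordered[:count])


def average_difference(pm, mm):
    """Average of the PM - MM intensity differences over the probe pairs of one gene."""
    if not pm or len(pm) != len(mm):
        raise ValueError("need equal, non-empty lists of PM and MM intensities")
    return mean(p - m for p, m in zip(pm, mm))


def frequency_from_signal(signal, spike_signal, spike_frequency):
    """Convert a signal to an RNA frequency by comparison with a spiked control.

    Assumes a linear response, which the text notes breaks down near saturation.
    """
    if spike_signal <= 0:
        raise ValueError("spike signal must be positive")
    return signal * spike_frequency / spike_signal
```

Calibrating on the stated pair of a signal of 100 at a frequency of 1:50,000, a signal of 50,000 maps to a frequency of about 1:100 under this linear assumption.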



I. Expression Monitoring and Generic Difference Screening.

This invention provides methods of expression monitoring and generic difference screening. The term expression monitoring is used to refer to the determination of levels of expression of particular, typically preselected, genes. In a preferred embodiment, the expression monitoring methods of this invention utilize high density arrays of oligonucleotides selected to be complementary to predetermined subsequences of the gene or genes whose expression levels are to be detected. Nucleic acid samples are hybridized to the arrays and the resulting hybridization signal provides an indication of the level of expression of each gene of interest. Because of the high degree of probe redundancy (typically there are multiple probes per gene) the expression monitoring methods provide an essentially accurate absolute measurement and do not require comparison to a reference nucleic acid.

In another embodiment, this invention provides generic difference screening methods, that identify differences in the abundance (concentration) of particular nucleic acids in two or more nucleic acid samples. The generic difference screening methods involve hybridizing two or more nucleic acid samples to the same high density oligonucleotide array, or to different high density oligonucleotide arrays having the same oligonucleotide probe composition, and optionally the same oligonucleotide spatial distribution. The resulting hybridizations are then compared, allowing determination of which nucleic acids differ in abundance (concentration) between the two or more samples.

Where the concentrations of the nucleic acids comprising the samples reflect transcription levels of genes in a sample from which the nucleic acids are derived, the generic difference screening methods permit identification of differences in transcription (and by implication in expression) of the nucleic acids comprising the two or more samples. The differentially (e.g., over- or under-) expressed nucleic acids thus identified can be used (e.g., as probes) to determine and/or isolate those genes whose expression levels differ between the two or more samples.

The generic difference screening methods are advantageous in that, in contrast to the expression monitoring methods, they require no a priori assumptions about the probe oligonucleotide composition of the array. To the contrary, the sequences of the



probe oligonucleotides may be random, haphazard, or any arbitrary subset of oligonucleotide probes. Where the oligonucleotide probes are short enough (e.g., less than or equal to a 12 mer) the array may contain every possible nucleic acid of that length. Despite the fact that the generic difference screening arrays might be arbitrary or random, since the sequence of each probe in the array is known the generic difference screening methods still provide direct sequence information regarding the differentially expressed nucleic acids in the samples.

The expression monitoring and generic difference screening methods of this invention involve providing an array containing a large number (e.g. greater than 1,000) of arbitrarily selected different oligonucleotide probes (probe oligonucleotides) where the sequence and location in the array of each different probe is known. Nucleic acid samples (e.g. mRNA) are hybridized to the probe arrays and the pattern of hybridization is detected.

It is demonstrated herein and in copending applications U.S. Patent Serial No. 08/529,115 filed on September 15, 1995 and PCT/US96/14839 that hybridization with high density oligonucleotide probe arrays provides an effective means of detecting and/or quantifying the expression of particular nucleic acids in complex nucleic acid populations. The expression monitoring and difference screening methods of this invention may be used in a wide variety of circumstances including detection of disease, identification of differential gene expression between two samples (e.g., a pathological as compared to a healthy sample), screening for compositions that upregulate or downregulate the expression of particular genes, and so forth.

In one preferred embodiment, the methods of this invention are used to monitor the expression (transcription) levels of nucleic acids whose expression is altered in a disease state. For example, a cancer may be characterized by the overexpression of a particular marker such as the HER2 (c-erbB-2/neu) proto-oncogene in the case of breast cancer. Similarly, overexpression of receptor tyrosine kinases (RTKs) is associated with the etiology of a number of tumors including carcinomas of the breast, liver, bladder, pancreas, as well as glioblastomas, sarcomas and squamous carcinomas (see Carpenter, Ann. Rev. Biochem., 56: 881-914 (1987)). Conversely, a cancer (e.g., colorectal, lung and breast) may be characterized by the mutation of or underexpression
of a tumor suppressor gene such as P53 (see, e.g., Tominaga et al. Critical Rev. in Oncogenesis, 3: 257-282 (1992)).

Where the particular genes of interest are known, the high density arrays will preferably contain probe oligonucleotides selected to be complementary to the sequences or subsequences of those genes of interest. High probe redundancy for each gene of interest can be achieved and absolute expression levels of each gene can be determined.

Conversely, where it is unknown which genes differ in expression between the healthy and disease state the generic difference screening methods of this invention are particularly appropriate. Hybridization of the healthy and pathological nucleic acids to the generic difference screening arrays disclosed herein and comparison of the hybridization patterns identifies those genes whose regulation is altered in the pathological state.

Similarly, the expression monitoring and generic difference screening methods of this invention can be used to monitor expression of various genes in response to defined stimuli, such as a drug, cell activation, etc. The methods are particularly advantageous because they permit simultaneous monitoring of the expression of large numbers of genes. This is especially useful in drug research if the end point description is a complex one, not simply asking if one particular gene is overexpressed or underexpressed. Thus, where a disease state or the mode of action of a drug is not well characterized, the methods of this invention allow rapid determination of the particularly relevant genes. Again, where the gene of interest is known or suspected, expression monitoring methods will preferably be used, while generic screening methods will be used when the particular genes of interest are unknown.

Using the generic difference screening methods disclosed herein, lack of knowledge regarding the particular genes does not prevent identification of useful therapeutics. For example, if the hybridization pattern on a particular high density array for a healthy cell is known and significantly different from the pattern for a diseased cell, then libraries of compounds can be screened for those that cause the pattern for a diseased cell to become like that for the healthy cell. This provides a very detailed measure of the cellular response to a drug.
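By way of illustration only, the compound screen described above can be sketched as a ranking of compounds by how closely the hybridization pattern of a treated diseased cell approaches the pattern of a healthy cell. The sketch assumes each pattern is a list of intensities in the same probe order and uses Euclidean distance as the similarity measure; the distance measure, the function names, and the compound labels are hypothetical choices for this example, and the disclosure does not prescribe any particular measure.

```python
from math import sqrt


def pattern_distance(pattern_a, pattern_b):
    """Euclidean distance between two hybridization patterns in the same probe order."""
    if len(pattern_a) != len(pattern_b):
        raise ValueError("patterns must cover the same probes")
    return sqrt(sum((a - b) ** 2 for a, b in zip(pattern_a, pattern_b)))


def rank_compounds(healthy_pattern, treated_patterns):
    """Return compound names ordered from most to least healthy-like.

    `treated_patterns` maps a compound name to the pattern measured for
    diseased cells exposed to that compound.
    """
    return sorted(
        treated_patterns,
        key=lambda name: pattern_distance(healthy_pattern, treated_patterns[name]),
    )
```

A compound appearing first in the returned list is one whose treated pattern lies closest to the healthy pattern under this assumed measure; no knowledge of the underlying genes is needed for the ranking.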



Generic difference screening methods thus provide a powerful tool for gene discovery and for elucidating mechanisms underlying complex cellular responses to various stimuli. For example, in one embodiment, generic difference screening can be used for "expression fingerprinting". Suppose it is found that the mRNA from a certain cell type displays a distinct overall hybridization pattern that is different under different conditions (e.g. when harboring mutations in particular genes, in a disease state). Then this pattern of expression (an expression fingerprint), if reproducible and clearly differentiable in the different cases, can be used as a very detailed diagnostic. It is not even required that the pattern be fully interpretable, but just that it is specific for a particular cell state (and preferably of diagnostic and/or prognostic relevance).

Both expression monitoring methods and generic difference screening may also be used in drug safety studies. For example, if one is making a new antibiotic, then it should not significantly affect the expression profile for mammalian cells. The hybridization pattern could be used as a detailed measure of the effect of a drug on cells. In other words, as a toxicological screen.

The expression monitoring and generic difference screening methods of this invention are particularly well suited for gene discovery. For example, as explained above, the generic difference screening methods identify differences in abundances of nucleic acids in two or more samples. These differences may indicate changes in the expression levels of previously unknown genes. The sequence information provided by a difference screening array can be utilized, as described herein, to identify the unknown gene.

The expression monitoring methods can be used in gene discovery by exploiting the fact that many genes that have been discovered to date have been classified into families based on commonality of the sequences. Because of the extremely large number of probes it is possible to place in the high density array, it is possible to include oligonucleotide probes representing known or parts of known members from every gene class. In utilizing such a "chip" (high density array) genes that are already known would give a positive signal at loci containing both variable and common regions. For unknown genes, only the common regions of the gene family would give a positive signal. The result would indicate the possibility of a newly discovered gene.
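By way of illustration only, the gene family reasoning above can be sketched as a simple classification of the signals from probes directed to the common regions of a family and probes directed to the variable regions of a known member. The sketch assumes a single intensity threshold separates positive from negative signals and that averaging over probes is adequate; the threshold, the function name, and the returned labels are hypothetical choices for this example and are not part of the disclosure.

```python
from statistics import mean


def classify_family_signal(common_signals, variable_signals, threshold):
    """Classify a family member from common-region and variable-region probe signals."""
    if not common_signals or not variable_signals:
        raise ValueError("both probe groups need at least one signal")
    common_positive = mean(common_signals) >= threshold
    variable_positive = mean(variable_signals) >= threshold
    if common_positive and variable_positive:
        # Known genes give a positive signal at both variable and common regions.
        return "known member"
    if common_positive:
        # Only the common regions are positive: possibly a newly discovered gene.
        return "possible new member"
    if variable_positive:
        # Not a case described in the text; flagged rather than interpreted.
        return "inconclusive"
    return "not detected"
```

A result of "possible new member" corresponds to the case in which only the common regions of the gene family give a positive signal.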



The expression monitoring and generic difference screening methods of this invention thus also allow the development of "dynamic" gene databases. The Human Genome Project and commercial sequencing projects have generated large static databases which list thousands of sequences without regard to function or genetic interaction. Analyses using the methods of this invention produce "dynamic" databases that define a gene's function and its interactions with other genes. Without the ability to monitor the expression of large numbers of genes simultaneously, or the ability to detect differences in abundances of large numbers of "unknown" nucleic acids simultaneously, the work of creating such a database is enormous.

The tedious nature of using DNA sequence analysis for determining an expression pattern involves preparing a cDNA library from the RNA isolated from the cells of interest and then sequencing the library. As the library is sequenced, the operator lists the sequences that are obtained and counts them. Thousands of sequences would have to be determined and then the frequency of those gene sequences would define the expression pattern of genes for the cells being studied.

By contrast, using an expression monitoring, or generic difference screening, array to obtain the data according to the methods of this invention is relatively fast and easy. For example, in one embodiment, cells may be stimulated to induce expression. The RNA is obtained from the cells and then either labeled directly or a cDNA copy is created. Fluorescent molecules may be incorporated during the DNA polymerization. Either the labeled RNA or the labeled cDNA is then hybridized to a high density array in one overnight experiment. The hybridization provides a quantitative assessment of the levels of every single one of the hybridized nucleic acids with no additional sequencing. In addition the methods of this invention are much more sensitive, allowing a few copies of expressed genes per cell to be detected. This procedure is demonstrated in the examples provided herein. These uses of the methods of this invention are intended to be illustrative and in no manner limiting.



30 
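The sequence-counting approach described above, in which the operator lists the sequenced clones and counts them, amounts to a simple frequency tally. The following is a minimal sketch of that arithmetic only; the function name and the gene labels are hypothetical and are not taken from the specification.

```python
from collections import Counter


def expression_pattern_from_clones(clone_gene_ids):
    """Return the fraction of sequenced clones assigned to each gene.

    clone_gene_ids: one gene label per sequenced cDNA clone (hypothetical labels).
    """
    counts = Counter(clone_gene_ids)
    total = sum(counts.values())
    # Relative frequency of each gene among all sequenced clones.
    return {gene: n / total for gene, n in counts.items()}


# Hypothetical tally of ten sequenced clones.
clones = ["actin"] * 6 + ["gapdh"] * 3 + ["il2"]
pattern = expression_pattern_from_clones(clones)
```

A transcript present at one copy among many thousands would need thousands of sequenced clones before it appears even once in such a tally, which is the practical burden the paragraph above points to.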



II. High Density Arrays For Generic Difference Screening and Expression Monitoring.

As indicated above, this invention provides methods of monitoring (detecting and/or quantifying) the expression levels of a large number of nucleic acids and/or determining differences in nucleic acid concentrations (abundances) between two or more samples. The methods involve hybridization of one or more nucleic acid samples (target nucleic acids) to one or more high density arrays of nucleic acid probes and then quantifying the amount of target nucleic acids hybridized to each probe in the array.

While nucleic acid hybridization has been used for some time to determine the expression levels of various genes (e.g., Northern Blot), it was a surprising discovery of this invention that high density arrays are suitable for the quantification of the small variations in abundance (e.g., transcription and, by implication, expression) of a nucleic acid (e.g., gene) in the presence of a large population of heterogeneous nucleic acids. The signal (e.g., particular gene or gene product, or differentially abundant nucleic acid) may be present at a concentration of less than about 1 in 1,000, and is often present at a concentration less than 1 in 10,000, more preferably less than about 1 in 50,000 and most preferably less than about 1 in 100,000, 1 in 300,000, or even 1 in 1,000,000.

The oligonucleotide arrays can have oligonucleotides as short as 10 nucleotides; more preferably oligonucleotides of 15 nucleotides, and most preferably of 20 or 25 nucleotides, are used to specifically detect and quantify nucleic acid expression levels. Where ligation discrimination methods are used, the oligonucleotide arrays can contain shorter oligonucleotides. In this instance, oligonucleotide arrays comprising oligonucleotides ranging in length from 6 to 15 nucleotides, more preferably from about 8 to about 12 nucleotides in length, are preferred. Of course, arrays containing longer oligonucleotides, as described herein, are also suitable.

The expression monitoring arrays, which are designed to detect particular preselected genes, provide for simultaneous monitoring of at least about 10, preferably at least about 100, more preferably at least about 1000, still more preferably at least about 10,000, and most preferably at least about 100,000 different genes.



A) Advantages of Oligonucleotide Arrays.

In one preferred embodiment, the high density arrays used in the methods of this invention comprise chemically synthesized oligonucleotides. The use of chemically synthesized oligonucleotide arrays, as opposed to, for example, blotted arrays of genomic clones, restriction fragments, oligonucleotides, and the like, offers numerous advantages. These advantages generally fall into four categories:

1) Efficiency of production;

2) Reduced intra- and inter-array variability;

3) Increased information content; and

4) Improved signal to noise ratio.

1) Efficiency of production.

In a preferred embodiment, the arrays are synthesized using methods of spatially addressed parallel synthesis (see, e.g., Section V, below). The oligonucleotides are synthesized chemically in a highly parallel fashion covalently attached to the array surface. This allows extremely efficient array production. For example, arrays containing any collection of tens (or even hundreds) of thousands of specifically selected 20 mer oligonucleotides are synthesized in fewer than 80 synthesis cycles. The arrays are designed and synthesized based on sequence information alone. Thus, unlike blotting methods, the array preparation requires no handling of biological materials. There is no need for cloning steps, nucleic acid purifications or amplifications, cataloging of clones or amplification products, and the like. The preferred chemical synthesis of high density oligonucleotide arrays in this invention is thus more efficient than blotting methods and permits the production of highly reproducible high-density arrays.

2) Reduced intra- and inter-array variability.

The use of chemically synthesized high-density oligonucleotide arrays in the methods of this invention improves intra- and inter-array variability. The oligonucleotide arrays preferred for this invention are made in large batches (presently 49 arrays per wafer with multiple wafers synthesized in parallel) in a highly controlled reproducible manner. This makes them suitable as general diagnostic and research tools permitting direct comparisons of assays performed at different times and locations.

Because of the precise control obtainable during the chemical synthesis, the arrays of this invention show less than about 25%, preferably less than about 20%, more preferably less than about 15%, still more preferably less than about 10%, even more preferably less than about 5%, and most preferably less than about 2% variation between high density arrays (within or between production batches) having the same probe composition. Array variation is assayed as the variation in hybridization intensity (against a labeled control target nucleic acid mixture) in one or more oligonucleotide probes between two or more arrays. More preferably, array variation is assayed as the variation in hybridization intensity (against a labeled control target nucleic acid mixture) measured for one or more target genes between two or more arrays.

In addition to reducing inter- and intra-array variability, chemically synthesized arrays also reduce variations in relative probe frequency inherent in spotting methods, particularly spotting methods that use cell-derived nucleic acids (e.g., cDNAs). Many genes are expressed at the level of thousands of copies per cell, while others are expressed at only a single copy per cell. A cDNA library will reflect this very large bias as will a cDNA library made from this material. While normalization (adjustment of the amount of each different probe, e.g., by comparison to a reference cDNA) of the library will reduce the representation of over-expressed sequences to some extent, normalization has been shown to lessen the odds of selecting highly expressed cDNAs by only about a factor of 2 or 3. In contrast, chemical synthesis methods can ensure that all oligonucleotide probes are represented in approximately equal concentrations. This decreases the inter-gene (intra-array) variability and permits direct comparison between hybridization signals for different oligonucleotide probes.

3) Increased information content.

i) Advantages for expression monitoring.

The use of high density oligonucleotide arrays for expression monitoring provides a number of advantages not found with other methods. For example, the use of large numbers of different probes that specifically bind to the transcription product of a particular target gene provides a high degree of redundancy and internal control that permits optimization of probe sets for effective detection of particular target genes and minimizes the possibility of errors due to cross-reactivity with other nucleic acid species.
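The array-variation figures quoted in section 2) above (less than about 25% down to less than about 2%) are stated without a formula. As a hedged illustration only, the sketch below assumes that "variation" is measured as the coefficient of variation (standard deviation divided by mean) of each control probe's intensity across arrays, averaged over probes. That choice of statistic, the function name, and the sample numbers are assumptions, not the specification's definition.

```python
from statistics import mean, pstdev


def percent_variation(intensities_by_array):
    """Average per-probe coefficient of variation across arrays, in percent.

    intensities_by_array: one list per array, each holding the hybridization
    intensities of the same control probes in the same order.
    """
    per_probe_cv = []
    for values in zip(*intensities_by_array):  # one tuple per probe
        m = mean(values)
        if m == 0:
            raise ValueError("probe with zero mean intensity")
        # Population standard deviation relative to the mean (an assumed metric).
        per_probe_cv.append(100.0 * pstdev(values) / m)
    return mean(per_probe_cv)


# Two hypothetical arrays hybridized to the same labeled control mixture.
variation = percent_variation([[100.0, 200.0], [110.0, 180.0]])
```

With these made-up readings the result is about 5%, which would fall in the "less than about 10%" tier listed above.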






Apparently suitable probes often prove ineffective for expression monitoring by hybridization. For example, certain subsequences of a particular target gene may be found in other regions of the genome and probes directed to these subsequences will cross-hybridize with the other regions and not provide a signal that is a meaningful measure of the expression level of the target gene. Even probes that show little cross reactivity may be unsuitable because they generally show poor hybridization due to the formation of structures that prevent effective hybridization. Finally, in sets with large numbers of probes, it is difficult to identify hybridization conditions that are optimal for all the probes in a set. Because of the high degree of redundancy provided by the large number of probes for each target gene, it is possible to eliminate those probes that function poorly under a given set of hybridization conditions and still retain enough probes to a particular target gene to provide an extremely sensitive and reliable measure of the expression level (transcription level) of that gene.

In addition, the use of large numbers of different probes to each target gene makes it possible to monitor expression of families of closely-related nucleic acids. The probes may be selected to hybridize both with subsequences that are conserved across the family and with subsequences that differ in the different nucleic acids in the family. Thus, hybridization with such arrays permits simultaneous monitoring of the various members of a gene family even where the various genes are approximately the same size and have high levels of homology. Such measurements are difficult or impossible with traditional hybridization methods.

Because the high density arrays contain such a large number of probes it is possible to provide numerous controls including, for example, controls for variations or mutations in a particular gene, controls for overall hybridization conditions, controls for sample preparation conditions, controls for metabolic activity of the cell from which the nucleic acids are derived and mismatch controls for non-specific binding or cross hybridization.

Effective detection and quantitation of gene transcription in complex mammalian cell message populations can be determined with relatively short oligonucleotides and with relatively few (e.g., fewer than 40, preferably fewer than 30, more preferably fewer than 25, and most preferably fewer than 20, 15, or even 10) oligonucleotide probes per gene. There are a large number of probes which hybridize both strongly and specifically for each gene. This does not mean that a large number of probes is required for detection, but rather that there are many from which to choose and that choices can be based on other considerations such as sequence uniqueness (gene families), checking for splice variants, or genotyping hot spots (things not easily done with cDNA spotting methods).

In use, sets of four arrays for expression monitoring are made that contain approximately 400,000 probes each. Sets of about 40 probes (20 probe pairs) are chosen that are complementary to each of about 40,000 genes for which there are ESTs in the public database. This set of ESTs covers roughly one-third to one-half of all human genes and these arrays will allow the levels of all of them to be monitored in a parallel set of overnight hybridizations.

4) Improved signal to noise ratio.

Blotted nucleic acids sometimes rely on ionic, electrostatic, and hydrophobic interactions to attach the blotted nucleic acids to the substrate. Bonds are formed at multiple points along the nucleic acid, restricting degrees of freedom and interfering with the ability of the nucleic acid to hybridize to its complementary target. In contrast, the preferred arrays of this invention are chemically synthesized. The oligonucleotide probes are attached to the substrate by a single terminal covalent bond. The probes have more degrees of freedom and are capable of participating in complex interactions with their complementary targets. Consequently, the probe arrays of this invention show significantly higher hybridization efficiencies (10 times, 100 times, and even 1000 times more efficient) than blotted arrays. Less target oligonucleotide is used to produce a given signal, thereby dramatically improving the signal to noise ratio. Consequently the methods of this invention permit detection of only a few copies of a nucleic acid in extremely complex nucleic acid mixtures.
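The redundancy argument made under section 3) above, namely that poorly functioning probes can be eliminated while enough probes remain to measure a gene reliably, can be sketched as follows. This is an illustration under stated assumptions: the rule for discarding a probe pair (perfect-match signal must exceed the mismatch control by a hypothetical factor `min_ratio`) and the use of the mean difference as the summary are choices made for this sketch, not the procedure defined in the specification.

```python
from statistics import mean


def summarize_probe_set(pairs, min_ratio=1.2):
    """Summarize one gene's probe pairs.

    pairs: list of (perfect_match, mismatch) hybridization intensities.
    min_ratio: hypothetical cutoff; a pair is kept only if the perfect-match
    signal is at least this many times the mismatch control signal.
    Returns (mean of perfect_match - mismatch over kept pairs, pairs kept),
    or (None, 0) if every pair is discarded.
    """
    kept = [(pm, mm) for pm, mm in pairs if pm > mm and pm >= min_ratio * mm]
    if not kept:
        return None, 0
    return mean(pm - mm for pm, mm in kept), len(kept)


# Four hypothetical probe pairs; the last two behave poorly and are dropped.
signal, n_kept = summarize_probe_set([(500, 100), (450, 120), (90, 100), (300, 290)])
```

Because each gene is represented by many pairs (about 20 in the array sets described above), dropping a few poorly behaving pairs still leaves a usable estimate.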



B) Preferred High Density Arrays.

Preferred high density arrays of this invention comprise greater than about 100, preferably greater than about 1000, more preferably greater than about 16,000 and most preferably greater than about 65,000 or 250,000 or even greater than about 1,000,000 different oligonucleotide probes. The oligonucleotide probes range from about 5 to about 50 or about 5 to about 45 nucleotides, more preferably from about 10 to about 40 nucleotides and most preferably from about 15 to about 40 nucleotides in length. In particular preferred embodiments, the oligonucleotide probes are 20 or 25 nucleotides in length, while in other preferred embodiments (particularly where ligation discrimination reactions are used) the oligonucleotide probes are preferably shorter (e.g., 6 to 20, more preferably 8 to 15 nucleotides in length). It was a discovery of this invention that relatively short oligonucleotide probes are sufficient to specifically hybridize to and distinguish target sequences. Thus in one preferred embodiment the oligonucleotide probes are less than 50 nucleotides in length, generally less than 46 nucleotides, more generally less than 41 nucleotides, most generally less than 36 nucleotides, preferably less than 31 nucleotides, more preferably less than 26 nucleotides, and most preferably less than 21 nucleotides in length. The probes can also be less than 16 nucleotides, less than 13 nucleotides in length, less than 9 nucleotides in length and less than 7 nucleotides in length. It is also recognized that the oligonucleotide probes can be relatively long, ranging in length up to about 1000 nucleotides, more typically up to about 500 nucleotides in length.

The location and, in some embodiments, sequence of each different oligonucleotide probe in the array is known. Moreover, the large number of different probes occupies a relatively small area providing a high density array having a probe density of generally greater than about 60, more generally greater than about 100, most generally greater than about 600, often greater than about 1,000, more often greater than about 5,000, most often greater than about 10,000, preferably greater than about 40,000, more preferably greater than about 100,000, and most preferably greater than about 400,000 different oligonucleotide probes per cm². The small surface area of the array (often less than about 10 cm², preferably less than about 5 cm², more preferably less than about 2 cm², and most preferably less than about 1.6 cm²) permits the use of small sample volumes and extremely uniform hybridization conditions (temperature regulation, salt content, etc.) while the extremely large number of probes allows massively parallel processing of hybridizations.

Finally, because of the small area occupied by the high density arrays, hybridization may be carried out in extremely small fluid volumes (e.g., 250 µl or less, more preferably 100 µl or less, and most preferably 10 µl or less). In addition, hybridization conditions are extremely uniform throughout the sample, and the hybridization format is amenable to automated processing.

III. Monitoring Gene Expression and Generic Difference Screening.

As explained above, this invention provides methods for monitoring gene expression (expression monitoring) and for identifying differences in abundance (concentration) of nucleic acids in two or more nucleic acid samples (generic difference screening). Generally the methods of monitoring gene expression of this invention involve (1) providing a pool of target nucleic acids comprising RNA transcript(s) of one or more target gene(s), or nucleic acids derived from the RNA transcript(s); (2) hybridizing the nucleic acid sample to a high density array of probes (including control probes); and (3) detecting the hybridized nucleic acids and calculating a relative expression (transcription) level. These methods preferably involve the use of high density oligonucleotide arrays containing probes to specifically preselected genes.

In contrast, the arrays used in the generic difference screening methods of this invention do not require that specific target genes be identified. To the contrary, the methods are designed to detect changes or differences in expression of various genes where the particular gene to be identified is unknown prior to performing the difference screening.

The methods of generic difference screening typically involve the steps of: 1) providing one or more high density oligonucleotide arrays (preferably including probe pairs differing in one or more nucleotides); 2) providing two or more nucleic acid samples; 3) hybridizing the nucleic acid samples to one or more arrays to form hybrid duplexes between nucleic acids in the nucleic acid samples and probe oligonucleotides in the array(s); 4) detecting the hybridization of the nucleic acids to the arrays; and 5) determining the differences in hybridization between the nucleic acid samples.

The provision of a nucleic acid sample, the hybridization of the sample to the arrays, and detection of the hybridized nucleic acid(s) is performed in essentially the same manner in expression monitoring and in generic difference screening methods. As disclosed herein, in preferred embodiments, the methods are distinguished, in part, by oligonucleotide probe selection, in the use of at least two nucleic acid samples in generic difference screening, and in subsequent analysis.
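The final step of generic difference screening described above, determining the differences in hybridization between samples, can be pictured as a probe-by-probe comparison of two intensity readouts. The sketch below is only one plausible way to do this; the fold-change cutoff `min_fold`, the noise `floor`, and the probe identifiers are hypothetical values chosen for illustration, not parameters given in the specification.

```python
def differing_probes(sample_a, sample_b, min_fold=2.0, floor=20.0):
    """Flag probes whose hybridization intensity differs between two samples.

    sample_a, sample_b: dicts mapping probe id -> hybridization intensity.
    min_fold: hypothetical minimum fold difference to report.
    floor: hypothetical noise floor; intensities below it are raised to it
           so that two near-background readings are not reported as different.
    """
    flagged = {}
    for probe in sample_a.keys() & sample_b.keys():
        a = max(sample_a[probe], floor)
        b = max(sample_b[probe], floor)
        fold = a / b if a >= b else b / a
        if fold >= min_fold:
            flagged[probe] = (sample_a[probe], sample_b[probe])
    return flagged


# Hypothetical readouts for three probes hybridized with two samples.
a = {"p1": 1000.0, "p2": 150.0, "p3": 10.0}
b = {"p1": 980.0, "p2": 600.0, "p3": 15.0}
differences = differing_probes(a, b)
```

Note that no gene identity is needed to run the comparison, which matches the point above that the gene of interest can be unknown before the screen is performed.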



A) Providing a Nucleic Acid Sample.

In order to measure the nucleic acid concentration in a sample, it is desirable to provide a nucleic acid sample for such analysis. Where it is desired that the nucleic acid concentration, or differences in nucleic acid concentration between different samples, reflect transcription levels or differences in transcription levels of a gene or genes, it is desirable to provide a nucleic acid sample comprising mRNA transcript(s) of the gene or genes, or nucleic acids derived from the mRNA transcript(s). As used herein, a nucleic acid derived from an mRNA transcript refers to a nucleic acid for whose synthesis the mRNA transcript or a subsequence thereof has ultimately served as a template. Thus, a cDNA reverse transcribed from an mRNA, an RNA transcribed from that cDNA, a DNA amplified from the cDNA, an RNA transcribed from the amplified DNA, etc., are all derived from the mRNA transcript and detection of such derived products is indicative of the presence and/or abundance of the original transcript in a sample. Thus, suitable samples include, but are not limited to, mRNA transcripts of the gene or genes, cDNA reverse transcribed from the mRNA, cRNA transcribed from the cDNA, DNA amplified from the genes, RNA transcribed from amplified DNA, and the like.

In a particularly preferred embodiment, where it is desired to quantify the transcription level (and thereby expression) of one or more genes in a sample, the nucleic acid sample is one in which the concentration of the mRNA transcript(s) of the gene or genes, or the concentration of the nucleic acids derived from the mRNA transcript(s), is proportional to the transcription level (and therefore expression level) of that gene. Similarly, it is preferred that the hybridization signal intensity be proportional to the amount of hybridized nucleic acid. While it is preferred that the proportionality be relatively strict (e.g., a doubling in transcription rate results in a doubling in mRNA transcript in the sample nucleic acid pool and a doubling in hybridization signal), one of skill will appreciate that the proportionality can be more relaxed and even non-linear. Thus, for example, an assay where a 5 fold difference in concentration of the target mRNA results in a 3 to 6 fold difference in hybridization intensity is sufficient for most purposes. Where more precise quantification is required, appropriate controls can be run to correct for variations introduced in sample preparation and hybridization as described herein. In addition, serial dilutions of "standard" target mRNAs can be used to prepare calibration curves according to methods well known to those of skill in the art. Of course, where simple detection of the presence or absence of a transcript or large differences or changes in nucleic acid concentration is desired, no elaborate control or calibration is required.

In the simplest embodiment, such a nucleic acid sample is the total mRNA or a total cDNA isolated and/or otherwise derived from a biological sample. The term "biological sample", as used herein, refers to a sample obtained from an organism or from components (e.g., cells) of an organism. The sample may be of any biological tissue or fluid. Frequently the sample will be a "clinical sample" which is a sample derived from a patient. Such samples include, but are not limited to, sputum, blood, blood cells (e.g., white cells), tissue or fine needle biopsy samples, urine, peritoneal fluid, and pleural fluid, or cells therefrom. Biological samples may also include sections of tissues such as frozen sections taken for histological purposes.

The nucleic acid (either genomic DNA or mRNA) may be isolated from the sample according to any of a number of methods well known to those of skill in the art. One of skill will appreciate that where alterations in the copy number of a gene are to be detected, genomic DNA is preferably isolated. Conversely, where expression levels of a gene or genes are to be detected, preferably RNA (mRNA) is isolated.

Methods of isolating total mRNA are well known to those of skill in the art. For example, methods of isolation and purification of nucleic acids are described in detail in Chapter 3 of Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization With Nucleic Acid Probes, Part I. Theory and Nucleic Acid Preparation, P. Tijssen, ed. Elsevier, N.Y. (1993).

In a preferred embodiment, the total nucleic acid is isolated from a given sample using, for example, an acid guanidinium-phenol-chloroform extraction method and polyA+ mRNA is isolated by oligo dT column chromatography or by using (dT)n magnetic beads (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual (2nd ed.), Vols. 1-3, Cold Spring Harbor Laboratory, (1989), or Current Protocols in Molecular Biology, F. Ausubel et al., ed. Greene Publishing and Wiley-Interscience, New York (1987)).

Frequently, it is desirable to amplify the nucleic acid sample prior to hybridization. One of skill in the art will appreciate that whatever amplification method is used, if a quantitative result is desired, care must be taken to use a method that maintains or controls for the relative frequencies of the amplified nucleic acids.

Methods of "quantitative" amplification are well known to those of skill in the art. For example, quantitative PCR involves simultaneously co-amplifying a known quantity of a control sequence using the same primers. This provides an internal standard that may be used to calibrate the PCR reaction. The high density array may then include probes specific to the internal standard for quantification of the amplified nucleic acid.

One preferred internal standard is a synthetic AW106 cRNA. The AW106 cRNA is combined with RNA isolated from the sample according to standard techniques known to those of skill in the art. The RNA is then reverse transcribed using a reverse transcriptase to provide copy DNA. The cDNA sequences are then amplified (e.g., by PCR) using labeled primers. The amplification products are separated, typically by electrophoresis, and the amount of radioactivity (proportional to the amount of amplified product) is determined. The amount of mRNA in the sample is then calculated by comparison with the signal produced by the known AW106 RNA standard. Detailed protocols for quantitative PCR are provided in PCR Protocols, A Guide to Methods and Applications, Innis et al., Academic Press, Inc. N.Y., (1990).

Other suitable amplification methods include, but are not limited to, polymerase chain reaction (PCR) (Innis, et al., PCR Protocols. A Guide to Methods and Applications. Academic Press, Inc. San Diego, (1990)), ligase chain reaction (LCR) (see Wu and Wallace, Genomics, 4: 560 (1989), Landegren, et al., Science, 241: 1077 (1988) and Barringer, et al., Gene, 89: 117 (1990)), transcription amplification (Kwoh, et al., Proc.
Natl. Acad. Sci. USA, 86: 1173 (1989)), and self-sustained sequence replication (Guatelli, et al., Proc. Nat. Acad. Sci. USA, 87: 1874 (1990)).

In a particularly preferred embodiment, the sample mRNA is reverse transcribed with a reverse transcriptase and a primer consisting of oligo dT and a sequence encoding the phage T7 promoter to provide single stranded DNA template. The second DNA strand is polymerized using a DNA polymerase. After synthesis of double-stranded cDNA, T7 RNA polymerase is added and RNA is transcribed from the cDNA template. Successive rounds of transcription from each single cDNA template result in amplified RNA. Methods of in vitro polymerization are well known to those of skill in the art (see, e.g., Sambrook, supra) and this particular method is described in detail by Van Gelder, et al., Proc. Natl. Acad. Sci. USA, 87: 1663-1667 (1990), who demonstrate that in vitro amplification according to this method preserves the relative frequencies of the various RNA transcripts. Moreover, Eberwine et al., Proc. Natl. Acad. Sci. USA, 89: 3010-3014, provide a protocol that uses two rounds of amplification via in vitro transcription to achieve greater than 10^6 fold amplification of the original starting material, thereby permitting expression monitoring even when biological samples are limited.

It will be appreciated by one of skill in the art that the direct transcription method described above provides an antisense RNA (aRNA) pool. Where antisense RNA is used as the target nucleic acid, the oligonucleotide probes provided in the array are chosen to be complementary to subsequences of the antisense nucleic acids. Conversely, where the target nucleic acid pool is a pool of sense nucleic acids, the oligonucleotide probes are selected to be complementary to subsequences of the sense nucleic acids. Finally, where the nucleic acid pool is double stranded, the probes may be of either sense as the target nucleic acids include both sense and antisense strands.

The protocols cited above include methods of generating pools of either sense or antisense nucleic acids. Indeed, one approach can be used to generate either sense or antisense nucleic acids as desired. For example, the cDNA can be directionally cloned into a vector (e.g., Stratagene's pBluescript II KS (+) phagemid) such that it is flanked by the T3 and T7 promoters. In vitro transcription with the T3 polymerase will produce RNA of one sense (the sense depending on the orientation of the insert), while in vitro transcription with the T7 polymerase will produce RNA having the opposite sense. Other suitable cloning systems include phage lambda vectors designed for Cre-loxP plasmid subcloning (see e.g., Palazzolo et al., Gene, 88: 25-36 (1990)).

In a particularly preferred embodiment, a high activity RNA polymerase (e.g. about 2500 units/µL for T7, available from Epicentre Technologies) is used.
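The internal-standard calculation mentioned earlier for quantitative amplification, in which the amount of mRNA is calculated by comparison with the signal from a known quantity of standard, reduces to a simple ratio if one assumes that target and standard amplify with equal efficiency and that signal is linear in product amount. Both assumptions, along with the function name and the numbers used, are illustrative and are not stated in this form in the specification.

```python
def estimate_target_amount(target_signal, standard_signal, standard_amount):
    """Estimate the starting amount of a target from an internal standard.

    target_signal:   measured signal for the amplified target.
    standard_signal: measured signal for the co-amplified internal standard.
    standard_amount: known starting quantity of the internal standard.
    Assumes equal amplification efficiency and a linear signal response.
    """
    if standard_signal <= 0:
        raise ValueError("standard signal must be positive")
    return standard_amount * target_signal / standard_signal


# Hypothetical run: 1e5 copies of standard spiked in; signals in arbitrary units.
copies = estimate_target_amount(target_signal=500.0,
                                standard_signal=2000.0,
                                standard_amount=1e5)
```

Where the response is not linear, a calibration curve built from serial dilutions of a standard, as described above, would replace the single ratio used here.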



nucleotides 



B) LaMmg nucleic adds. 

9 1'abeBng makods/itrategies. 

In a prefemd embodiment, the hybridized 
detecting one or more labels attached to the sample nudetc 
incorporated by any of a number of means well known to 
However, in a prefencd embodiment, the label is 
amplification step in the preparation of the sample nucleic 
chain reaction (PGR) with labeled primers labeled 
anq>lification product The nudetc add (e^. DNA) is be 
labded deoxynudeotide triphosphates (dNTPs). The 
fragmented, exposed to an oligonoudeotide anay, and the 
determined by the amount of label now associated with the 
embodiment* transcriptioii amplification, as described 
(je,g. fluoresoeiii-labeled UTP and/or CTP) incoipoiates a 
adds. 

Alternatively, a label may be added directly 
sample {e.g„ mRNA, polyA mRNA, cDNA, etc) or to the 
amplification is completed. Such labding can result in the 
products and reduce the time required for the amplification 
labels to nuddc acids include, for example nick translation 
labeled RNA) by kinasing of the nucleic add and 
nudetc add linker joining the sample nuddc add to a labd 
labeling is discussed in more detail below in Section 

Detectable labels suitable for use in the 
composition detectable by spectroscopic, photochemical, bi 
dectrical, optical or chemicd means. Useful labels in the 
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nu cleic acids are detected by 

tcids. The labels may be 
th( >se of skill in the art 
simultaneo isly incorporated during the 
atjids. For example, polymerase 

will provide a labeled 
an plified in the presence of 
amplif ed nudeic add can be 
e)jtent of hybridization 
ahay. Inaprefentd 
above using a labeled nucleotide 
iabpl into the transcribed nucleic 



the original nuddc add 
tpiification product after the 
iijcreased yield of ami^ification 
n action. Means of attachiiig 
end-labeling (e.g. with a 
of a 

ie.g.,afluoroplxne). End 



subsequent attachment Gigation) < 



pit sent 



invention include any 

inununochemical, 
invention indude biotin 



bio ihemical, 
for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, texas red, rhodamine, green fluorescent protein, and the like, see, e.g., Molecular Probes, Eugene, Oregon, USA), radiolabels (e.g., 3H, 125I, 35S, 14C, or 32P), enzymes (e.g., horse radish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold (e.g., gold particles in the 40-80 nm diameter size range scatter green light with high efficiency) or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. Patents teaching the use of such labels include U.S. Patent Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241.

A fluorescent label is preferred because it provides a very strong signal with low background. It is also optically detectable at high resolution and sensitivity through a quick scanning procedure. The nucleic acid samples can all be labeled with a single label, e.g., a single fluorescent label. Alternatively, in another embodiment, different nucleic acid samples can be simultaneously hybridized where each nucleic acid sample has a different label. For instance, one target could have a green fluorescent label and a second target could have a red fluorescent label. The scanning step will distinguish sites of binding of the red label from those binding the green fluorescent label. Each nucleic acid sample (target nucleic acid) can be analyzed independently from one another.

Suitable chromogens which can be employed include those molecules and compounds which absorb light in a distinctive range of wavelengths so that a color can be observed or, alternatively, which emit light when irradiated with radiation of a particular wave length or wave length range, e.g., fluorescers.

A wide variety of suitable dyes are available, being primarily chosen to provide an intense color with minimal absorption by their surroundings. Illustrative dye types include quinoline dyes, triarylmethane dyes, acridine dyes, alizarine dyes, phthaleins, insect dyes, azo dyes, anthraquinoid dyes, cyanine dyes, phenazathionium dyes, and phenazoxonium dyes.

A wide variety of fluorescers can be employed either alone or, alternatively, in conjunction with quencher molecules. Fluorescers of interest fall into a variety of categories having certain primary functionalities. These primary functionalities include 1- and 2-aminonaphthalene, p,p'-diaminostilbenes, pyrenes, quaternary
phenanthridine salts, 9-aminoacridines, p,p'-diaminobenzophenone imines, anthracenes, oxacarbocyanine, merocyanine, 3-aminoequilenin, perylene, bis-benzoxazole, bis-p-oxazolyl benzene, 1,2-benzophenazin, retinol, bis-3-aminopyridinium salts, hellebrigenin, tetracycline, sterophenol, benzimidazolylphenylamine, 2-oxo-3-chromen, indole, xanthen, 7-hydroxycoumarin, phenoxazine, salicylate, strophanthidin, porphyrins, triarylmethanes and flavin. Individual fluorescent compounds which have functionalities for linking or which can be modified to incorporate such functionalities include, e.g., dansyl chloride; fluoresceins such as 3,6-dihydroxy-9-phenylxanthhydrol; rhodamineisothiocyanate; N-phenyl 1-amino-8-sulfonatonaphthalene; N-phenyl 2-amino-6-sulfonatonaphthalene; 4-acetamido-4-isothiocyanato-stilbene-2,2'-disulfonic acid; pyrene-3-sulfonic acid; 2-toluidinonaphthalene-6-sulfonate; N-phenyl, N-methyl 2-aminoaphthalene-6-sulfonate; ethidium bromide; stebrine; auromine-0,2-(9'-anthroyl)palmitate; dansyl phosphatidylethanolamine; N,N'-dioctadecyl oxacarbocyanine; N,N'-dihexyl oxacarbocyanine; merocyanine, 4(3'pyrenyl)butyrate; d-3-aminodesoxy-equilenin; 12-(9'-anthroyl)stearate; 2-methylanthracene; 9-vinylanthracene; 2,2'(vinylene-p-phenylene)bisbenzoxazole; p-bis[2-(4-methyl-5-phenyl-oxazolyl)]benzene; 6-dimethylamino-1,2-benzophenazin; retinol; bis(3'-aminopyridinium) 1,10-decandiyl diiodide; sulfonaphthylhydrazone of hellebrigenin; chlorotetracycline; N(7-dimethylamino-4-methyl-2-oxo-3-chromenyl)maleimide; N-[p-(2-benzimidazolyl)-phenyl]maleimide; N-(4-fluoranthyl)maleimide; bis(homovanillic acid); resazurin; 4-chloro-7-nitro-2,1,3-benzooxadiazole; merocyanine 540; resorufin; rose bengal; and 2,4-diphenyl-3(2H)-furanone.

Desirably, fluorescers should absorb light above about 300 nm, preferably about 350 nm, and more preferably above about 400 nm, usually emitting at wavelengths greater than about 10 nm higher than the wavelength of the light absorbed. It should be noted that the absorption and emission characteristics of the bound dye can differ from the unbound dye. Therefore, when referring to the various wavelength ranges and characteristics of the dyes, it is intended to indicate the dyes as employed and not the dye which is unconjugated and characterized in an arbitrary solvent.
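The wavelength guidance for fluorescers given above (absorption above about 300 nm, preferably 350 nm, more preferably 400 nm, with emission more than about 10 nm above the absorbed wavelength) can be pictured as a simple screening rule. The following is only a minimal sketch: the `Fluorescer` class, the function names, and the three dyes with their wavelengths are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fluorescer:
    """A candidate label; wavelengths refer to the dye as employed (bound)."""
    name: str
    absorption_nm: float
    emission_nm: float


def absorption_tier(dye: Fluorescer) -> str:
    """Classify the absorption wavelength against the 300/350/400 nm guidance."""
    if dye.absorption_nm > 400:
        return "more preferred"
    if dye.absorption_nm > 350:
        return "preferred"
    if dye.absorption_nm > 300:
        return "acceptable"
    return "outside range"


def meets_guidance(dye: Fluorescer, min_shift_nm: float = 10.0) -> bool:
    """True if the dye absorbs above 300 nm and emits more than
    `min_shift_nm` above its absorption wavelength."""
    shift = dye.emission_nm - dye.absorption_nm
    return dye.absorption_nm > 300 and shift > min_shift_nm


# Hypothetical dyes with made-up wavelengths, for illustration only.
candidates = [
    Fluorescer("dye_a", absorption_nm=495.0, emission_nm=520.0),
    Fluorescer("dye_b", absorption_nm=280.0, emission_nm=340.0),
    Fluorescer("dye_c", absorption_nm=360.0, emission_nm=365.0),
]
usable = [d.name for d in candidates if meets_guidance(d)]
```

Because the characteristics of a bound dye can differ from those of the unbound dye, any such screen would need wavelengths measured for the dye as it is actually employed.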
Fluorescers are generally preferred because by irradiating a fluorescer with light, one can obtain a plurality of emissions. Thus, a single label can provide for a plurality of measurable events.

Detectable signal can also be provided by chemiluminescent and bioluminescent sources. Chemiluminescent sources include a compound which becomes electronically excited by a chemical reaction and can then emit light which serves as the detectable signal or donates energy to a fluorescent acceptor. A diverse number of families of compounds have been found to provide chemiluminescence under a variety of conditions. One family of compounds is 2,3-dihydro-1,4-phthalazinedione. The most popular compound is luminol, which is the 5-amino compound. Other members of the family include the 5-amino-6,7,8-trimethoxy- and the dimethylamino[ca]benz analog. These compounds can be made to luminesce with alkaline hydrogen peroxide or calcium hypochlorite and base. Another family of compounds is the 2,4,5-triphenylimidazoles, with lophine as the common name for the parent product. Chemiluminescent analogs include para-dimethylamino and -methoxy substituents. Chemiluminescence can also be obtained with oxalates, usually oxalyl active esters, e.g., p-nitrophenyl and a peroxide, e.g., hydrogen peroxide, under basic conditions. Alternatively, luciferins can be used in conjunction with luciferase or lucigenins to provide bioluminescence.

Spin labels are provided by reporter molecules with an unpaired electron spin which can be detected by electron spin resonance (ESR) spectroscopy. Exemplary spin labels include organic free radicals, transitional metal complexes, particularly vanadium, copper, iron, and manganese, and the like. Exemplary spin labels include nitroxide free radicals.

The label may be added to the target (sample) nucleic acid(s) prior to, or after the hybridization. So called "direct labels" are detectable labels that are directly attached to or incorporated into the target (sample) nucleic acid prior to hybridization. In contrast, so called "indirect labels" are joined to the hybrid duplex after hybridization. Often, the indirect label is attached to a binding moiety that has been attached to the target nucleic acid prior to the hybridization. Thus, for example, the target nucleic acid may be biotinylated before the hybridization. After hybridization, an avidin-conjugated fluorophore will bind the biotin bearing hybrid duplexes providing a label that is easily
detected. For a detailed review of methods of labeling nucleic acids and detecting labeled hybridized nucleic acids see Laboratory Techniques in Biochemistry and Molecular Biology, Vol. 24: Hybridization With Nucleic Acid Probes, P. Tijssen, ed. Elsevier, N.Y., (1993).

Fluorescent labels are preferred and easily added during an in vitro transcription reaction. In a preferred embodiment, fluorescein labeled UTP and CTP are incorporated into the RNA produced in an in vitro transcription reaction as described above.

The labels can be attached directly or through a linker moiety. In general, the site of label or linker-label attachment is not limited to any specific position. For example, a label may be attached to a nucleoside, nucleotide, or analogue thereof at any position that does not interfere with detection or hybridization as desired. For example, certain Label-ON Reagents from Clontech (Palo Alto, CA) provide for labeling interspersed throughout the phosphate backbone of an oligonucleotide and for terminal labeling at the 3' and 5' ends. As shown for example herein, labels can be attached at positions on the ribose ring or the ribose can be modified and even eliminated as desired. The base moieties of useful labeling reagents can include those that are naturally occurring or modified in a manner that does not interfere with the purpose to which they are put. Modified bases include but are not limited to 7-deaza A and G, 7-deaza-8-aza A and G, and other heterocyclic moieties.

E) End-labeling nucleic acids.

In many applications it is useful to directly label nucleic acid samples without having to go through an amplification, transcription or other nucleic acid conversion step. This is especially true for monitoring of mRNA levels when one would like to extract total cytoplasmic RNA or poly A+ RNA (mRNA) from cells and hybridize this material without any intermediate steps that could skew the original distribution of mRNA concentrations.

In general, end-labeling methods permit the optimization of the size of the nucleic acid to be labeled. End-labeling methods also decrease the sequence bias
sometimes associated with polymerase-facilitated labeling methods. End labeling can be performed using terminal transferase (TdT).

End labeling can also be accomplished by ligating a labeled oligonucleotide or analog thereof to the end of a target nucleic acid or probe. Other end-labeling methods include the creation of a labeled or unlabeled "tail" for the nucleic acid using ligase or terminal transferase, for example. The tailed nucleic acid is then exposed to a labeled moiety that will preferentially associate with the tail. The tail and the moiety that preferentially associates with the tail can be a polymer such as a nucleic acid, peptide, or carbohydrate. The tail and its recognition moiety can be anything that permits recognition between the two, and includes molecules having ligand-substrate relationships such as haptens, epitopes, antibodies, enzymes and their substrates, and complementary nucleic acids and analogs thereof.

The labels associated with the tail or the tail recognition moiety include detectable moieties. When the tail and its recognition moiety are both labeled, the respective labels associated with each can themselves have a ligand-substrate relationship. The respective labels can also comprise energy transfer reagents such as dyes having different spectroscopic characteristics. The energy transfer pair can be chosen to obtain the desired combined spectral characteristics. For example, a first dye that absorbs at a wavelength shorter than that absorbed by the second dye can, upon absorption at that shorter wavelength, transfer energy to the second dye. The second dye then emits electromagnetic radiation at a wavelength longer than would have been emitted by the first dye alone. Energy transfer reagents can be particularly useful in two-color labeling schemes such as those set forth in a copending U.S. patent application, filed December 23, 1996, Attorney Docket No. 2013.2, and which is a continuation-in-part of USSN 08/529,115, filed September 15, 1995, and Int'l Appln. No. WO 96/14839, filed September 13, 1996, which is also a continuation-in-part of USSN 08/670,118, filed on June 25, 1996, which is a division of USSN 08/168,904, filed December 15, 1993, which is a continuation of USSN 07/624,114, filed December 6, 1990. USSN 07/624,114 is a CIP of USSN 07/362,901, filed June 7, 1990, incorporated herein by reference.

This invention thus provides methods of labeling a nucleic acid and reagents useful therefor. Many of the methods disclosed herein involve end-labeling.
Those skilled in the art will appreciate that the invention as disclosed is generally applicable in the chemical and molecular-biological arts.

In one embodiment, the method involves providing a nucleic acid, providing a labeled oligonucleotide and enzymatically ligating the oligonucleotide to the nucleic acid. Thus, for example, where the nucleic acid is an RNA, a labeled oligoribonucleotide can be ligated using an RNA ligase. RNA ligase catalyzes the covalent joining of single-stranded RNA (or DNA, but the reaction with RNA is more efficient) with a 5' phosphate group to the 3'-OH end of another piece of RNA (or DNA). The specific requirements for the use of this enzyme are provided in The Enzymes, Volume XV, Part B, T4 RNA ligase, Uhlenbeck and Greensport, pages 31-58; and 5.66-5.69 in Sambrook et al., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, New York (1982).

This invention thus provides a method to add a label to the nucleic acid (e.g., extracted RNA) directly rather than incorporating labeled nucleotides in a nucleic acid polymerization step. This can be accomplished by adding a short labeled oligonucleotide to the ends of a single stranded nucleic acid. The method more fully labels a sample; a higher percentage of available molecules will be labeled than by conventional techniques.

RNA can be randomly fragmented with heat in the presence of Mg2+. This generally produces RNA fragments with 5' OH groups and phosphorylated 3' ends. A phosphate group is added to the 5' ends of the fragments using standard protocols with T4 Polynucleotide Kinase, or similar enzyme. To the pool of 5'-phosphorylated RNA fragments is added RNA ligase plus a short RNA oligonucleotide with a 3' OH group and a label, either at the 5' end (such as fluorescein or other dye, biotin for later labeling with a streptavidin conjugate, or with digoxigenin for later labeling with a labeled antibody) or with one or more labeled bases. A riboA6 (deoxyribonucleic acid 6 mer poly A) labeled with either fluorescein or biotin at the 5' end provides a particularly preferred label. In another embodiment, the ligated RNA oligonucleotide can have ribonucleotides near the ligation end, but deoxyribonucleotides further away. Of course, the RNA oligonucleotide can be longer or shorter and can have virtually any sequence. However, the ligation reaction is most efficient with A and least efficient with U at the 3' end of the acceptor. The reaction is allowed to proceed under standard conditions. Unincorporated RNA 6-
mers can be removed by a simple size selection step (e.g., electrophoresis, NAP column, etc.) if necessary following the ligation reaction.

An advantage of this procedure is that extracted mRNA can be used directly and that each fragment should be labeled once, not any number of times depending on the sequence as is the case when labeled bases are incorporated during polymerization reactions.

In another embodiment, fragmented DNA can also be end-labeled using a different procedure with a different enzyme. Terminal transferase will add deoxynucleoside triphosphates (dNTPs), which can be labeled, to the 3' OH ends of single stranded DNA. Single dNTPs can be added if modified nucleotides are used (for example, dideoxynucleotide triphosphates), or multiple bases can be added if desired. DNA can be fragmented either physically (shearing) or enzymatically (nucleases), or chemically (e.g., acid hydrolysis). Following fragmentation, depending on the method, 3' OH ends may need to be produced. The DNA fragments are then labeled using labeled dNTPs or ddNTPs in the presence of terminal transferase.

Various other embodiments are illustrated by the Examples provided herein and their associated figures.

F) Modifying Sample to improve Signal to Noise Ratio.

The nucleic acid sample may be modified prior to hybridization to the high density probe array in order to reduce sample complexity thereby decreasing background signal and improving sensitivity of the measurement. In one embodiment, complexity reduction for expression monitoring methods is achieved by selective degradation of background mRNA. This is accomplished by hybridizing the sample mRNA (e.g., poly A+ RNA) with a pool of DNA oligonucleotides that hybridize specifically with the regions to which the probes in the expression monitoring array specifically hybridize. In a preferred embodiment, the pool of oligonucleotides consists of the same probe oligonucleotides as found on the high density array.

The pool of oligonucleotides hybridizes to the sample mRNA forming a number of double stranded (hybrid duplex) nucleic acids. The hybridized sample is then treated with RNase A, a nuclease that specifically digests single stranded RNA. The
RNase A is then inhibited, using a protease and/or commercially available RNase inhibitors, and the double stranded nucleic acids are then separated from the digested single stranded RNA. This separation may be accomplished in a number of ways well known to those of skill in the art including, but not limited to, electrophoresis and gradient centrifugation. However, in a preferred embodiment, the pool of DNA oligonucleotides is provided attached to beads forming thereby a nucleic acid affinity column. After digestion with the RNase A, the hybridized DNA is removed simply by denaturing (e.g., by adding heat or increasing salt) the hybrid duplexes and washing the previously hybridized mRNA off in an elution buffer.

The undigested mRNA fragments which will be hybridized to the probes in the high density array or other solid support are then preferably end-labeled with a fluorophore attached to an RNA linker using an RNA ligase. This procedure produces a labeled sample RNA pool in which the nucleic acids that do not correspond to probes in the array are eliminated and thus unavailable to contribute to a background signal.

Another method of reducing sample complexity involves hybridizing the mRNA with deoxyoligonucleotides that hybridize to regions that border on either side the regions to which the high density array probes are directed. Treatment with RNase H selectively digests the double stranded (hybrid duplexes) leaving a pool of single-stranded mRNA corresponding to the short regions (e.g., 20 mer) that were formerly bounded by the deoxyoligonucleotide probes and which correspond to the targets of the high density array probes and longer mRNA sequences that correspond to regions between the targets of the probes of the high density array. The short RNA fragments are then separated from the long fragments (e.g., by electrophoresis), labeled if necessary as described above, and then are ready for hybridization with the high density probe array.

In a third approach, sample complexity reduction involves the selective removal of particular (preselected) mRNA messages. In particular, highly expressed mRNA messages that are not specifically probed by the probes in the high density array are preferably removed. This approach involves hybridizing the poly A+ mRNA with an oligonucleotide probe that specifically hybridizes to the preselected message close to the 3' (polyA) end. The probe may be selected to provide high specificity and low cross reactivity. Treatment of the hybridized message/probe complex with RNase H digests the
double stranded region effectively removing the poly A+ tail from the rest of the message. The sample is then treated with methods that specifically retain or amplify poly A+ RNA (e.g., an oligo dT column or (dT)n magnetic beads). Such methods will not retain or amplify the selected message(s) as they are no longer associated with a poly A+ tail. These highly expressed messages are effectively removed from the sample providing a sample that has reduced background mRNA.
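The first complexity-reduction approach described in this section (hybridizing the sample mRNA to a pool of DNA oligonucleotides and digesting the unprotected single-stranded RNA with RNase A) can be pictured in silico. The sketch below is a deliberate simplification: it assumes exact, full-length complementarity and ignores hybridization kinetics, and the function names and sequences are hypothetical rather than part of the specification.

```python
# Base pairing between a DNA oligonucleotide and the RNA it hybridizes to.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def rna_target_of(dna_oligo: str) -> str:
    """RNA sequence (5'->3') that a DNA oligo (5'->3') would pair with."""
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in reversed(dna_oligo))


def protected_fragments(mrna: str, oligo_pool: list) -> list:
    """Return the mRNA stretches covered by at least one oligo in the pool.

    In this simplified model everything else is single stranded and is
    treated as digested.
    """
    intervals = []
    for oligo in oligo_pool:
        target = rna_target_of(oligo)
        start = mrna.find(target)
        while start != -1:
            intervals.append((start, start + len(target)))
            start = mrna.find(target, start + 1)

    # Merge overlapping or touching duplex regions into single fragments.
    intervals.sort()
    merged = []
    for begin, end in intervals:
        if merged and begin <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((begin, end))
    return [mrna[begin:end] for begin, end in merged]


# Hypothetical message and a single oligo complementary to part of it.
message = "AAGGCUUACGGAUCCAAA"
pool = ["CGTAAGCC"]
surviving = protected_fragments(message, pool)
```

In this toy model only the stretch paired with the oligo survives, which mirrors the intent of removing RNA that no array probe could detect.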



IV. Hybridization Array Design.

A) Probe Composition.

One of skill in the art will appreciate that an enormous number of array designs are suitable for the practice of this invention. Generic difference screening arrays, for example, may include random, haphazardly selected, or arbitrary probe sets. Alternatively, the generic difference screening arrays may include all possible oligonucleotides of a particular pre-selected length. Conversely, other expression monitoring arrays typically include a number of probes that specifically hybridize to the nucleic acid(s) expression of which is to be detected. In a preferred embodiment, the array will include one or more control probes.

1) Test probes.

In its simplest embodiment, the high density array includes "test probes" (also referred to as probe oligonucleotides) more than 5 bases long, preferably more than 10 bases long, and some more than 40 bases long. In some embodiments, the probes are less than 50 bases long. In some cases, these oligonucleotides range from about 5 to about 45 or 5 to about 50 nucleotides long, more preferably from about 10 to about 40 nucleotides long, and most preferably from about 15 to about 40 nucleotides in length. In other particularly preferred embodiments the probes are 20 or 25 nucleotides in length. In preselected expression monitoring arrays, these probe oligonucleotides have sequences complementary to particular subsequences of the genes whose expression they are designed to detect. Thus, the test probes are capable of specifically hybridizing to the target nucleic acid they are to detect.
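For preselected expression monitoring arrays, the test probes described above are complementary to subsequences of the target gene. A minimal sketch of how candidate probes of a chosen length could be enumerated follows; the function names, the tiling step, and the example sequence are assumptions made for illustration and do not represent a probe-selection procedure from the specification.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def candidate_test_probes(target: str, probe_length: int = 20, step: int = 1) -> list:
    """Enumerate probes complementary to each window of the target.

    The length check follows the range in the text: more than 5 bases and
    less than 50 bases, with 20 or 25 as particularly preferred lengths.
    """
    if not 5 < probe_length < 50:
        raise ValueError("probe_length should be more than 5 and less than 50")
    probes = []
    for start in range(0, len(target) - probe_length + 1, step):
        window = target[start:start + probe_length]
        probes.append(reverse_complement(window))
    return probes


# Hypothetical 30-base target subsequence, for illustration only.
target = "ATGGCCAAGTTGACCAGTGCCGTTCCGGTG"
probes = candidate_test_probes(target, probe_length=25)
```

A 30-base target tiled with 25-mers at a step of one base yields six overlapping candidates; a real design would further filter these for specificity and hybridization behavior.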
In high density oligonucleotide arrays designed for generic difference screening, the probe oligonucleotides need not be selected to hybridize to particular preselected subsequences of genes. To the contrary, generic difference screening arrays comprise probe oligonucleotides whose sequences are random, arbitrary, or haphazard. Alternatively, the probe oligonucleotides may include all possible nucleotides of a given length (e.g., all possible 4 mers, all possible 5 mers, all possible 6 mers, all possible 7 mers, all possible 8 mers, all possible 9 mers, all possible 10 mers, all possible 11 mers, all possible 12 mers, etc.).

A random oligonucleotide array is an array in which the pool of nucleotide sequences of a particular length does not significantly deviate from a pool of nucleotide sequences selected in a random manner (i.e., blind, unbiased selection) from a collection of all possible sequences of that length.

Arbitrary or haphazard nucleotide arrays of probe oligonucleotides are arrays in which the probe oligonucleotide selection is selected without identifying and/or preselecting target nucleic acids. Arbitrary or haphazard nucleotide arrays may approximate or even be random, however there is no assurance that they meet a statistical definition of randomness.

The arrays may reflect some nucleotide selection based on probe composition, and/or non-redundancy of probes, and/or coding sequence bias as described herein. In a preferred embodiment, however such "biased" probe sets are still not chosen to be specific for any particular genes.

An array comprising all possible oligonucleotides of a particular length refers to an array that contains oligonucleotides having sequences corresponding to substantially every permutation of a sequence. Thus since the probe oligonucleotides of this invention preferably include up to 4 bases (A, G, C, T) or (A, G, C, U) or derivatives of these bases, an array having all possible nucleotides of length X contains substantially 4^X different nucleic acids (e.g., 16 different nucleic acids for a 2 mer, 64 different nucleic acids for a 3 mer, 65536 different nucleic acids for an 8 mer, etc.). It will be appreciated that some small number of sequences may be inadvertently absent from a pool of all possible nucleotides of a particular length due to synthesis problems, inadvertent cleavage, etc. Thus, it will be appreciated that an array comprising all possible nucleotides of
length X refers to an array having substantially all possible nucleotides of length X. Substantially all possible nucleotides of length X include more than 90%, typically more than 95%, preferably more than 98%, more preferably more than 99%, and most preferably more than 99.9% of the possible number of different nucleotides.

The probe oligonucleotides described above can additionally include a constant domain. A constant domain being a nucleotide subsequence that is common to substantially all of the probe oligonucleotides. Particularly preferred constant domains are located at the terminus of the oligonucleotide probe closest to the substrate (i.e., attached to the linker/anchor molecule). The constant regions may comprise virtually any sequence. However, in one embodiment, the constant regions comprise a sequence or subsequence complementary to the sense or antisense strand of a restriction site (a nucleic acid sequence recognized by a restriction endonuclease).

The constant domain can be synthesized de novo on the array. Alternatively, the constant region may be prepared in a separate procedure and then coupled intact to the array. Since the constant domain can be synthesized separately and then the intact constant subsequences coupled to the high density array, the constant domain can be virtually any length. Some constant domains range from 3 nucleotides to about 500 nucleotides in length, more typically from 3 nucleotides in length to about 100 nucleotides in length, most typically from 3 nucleotides in length to about 50 nucleotides in length. In particular embodiments, constant domains range from 3 nucleotides to about 45 nucleotides in length, more preferably from 3 nucleotides in length to about 25 nucleotides in length and most preferably from 3 to about 15 or even 10 nucleotides in length. In other embodiments, preferred constant regions range from about 5 nucleotides to about 15 nucleotides in length.

In addition to test probes that bind the target nucleic acid(s) of interest, the high density array can contain a number of control probes. The control probes fall into three categories referred to herein as 1) Normalization controls; 2) Expression level controls; and 3) Mismatch controls.
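The earlier definition of an array comprising all possible oligonucleotides of length X (4^X sequences, with "substantially all" meaning more than 90% of them) lends itself to a short worked example. The following is a sketch only; the function names and the default threshold parameter are assumptions introduced for illustration.

```python
from itertools import product


def all_possible_oligos(length: int, alphabet: str = "ACGT") -> list:
    """Enumerate every sequence of the given length over the alphabet."""
    return ["".join(bases) for bases in product(alphabet, repeat=length)]


def coverage(present: list, length: int, alphabet: str = "ACGT") -> float:
    """Fraction of the possible sequences of this length that are present."""
    distinct = {seq for seq in present if len(seq) == length}
    return len(distinct) / len(alphabet) ** length


def is_substantially_all(present: list, length: int, threshold: float = 0.90) -> bool:
    """True if more than `threshold` of the possible sequences are present."""
    return coverage(present, length) > threshold


# The counts quoted in the text: 16 for a 2 mer, 64 for a 3 mer, 65536 for an 8 mer.
counts = {n: len(all_possible_oligos(n)) for n in (2, 3, 8)}

# A 3-mer pool missing one sequence (e.g., lost to a synthesis problem).
pool = all_possible_oligos(3)[:-1]
```

A pool of 63 of the 64 possible 3 mers covers about 98.4% of the sequence space, so it passes the 90%, 95% and 98% thresholds but not the 99% one.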
2) Normalization controls.

Normalization controls are oligonucleotide probes that are perfectly complementary to labeled reference oligonucleotides that are added to the nucleic acid sample. The signals obtained from the normalization controls after hybridization provide a control for variations in hybridization conditions, label intensity, "reading" efficiency and other factors that may cause the signal of a perfect hybridization to vary between arrays. In a preferred embodiment, signals (e.g., fluorescence intensity) read from all other probes in the array are divided by the signal (e.g., fluorescence intensity) from the control probes thereby normalizing the measurements.

Virtually any probe may serve as a normalization control. However, it is recognized that hybridization efficiency varies with base composition and probe length. Preferred normalization probes are selected to reflect the average length of the other probes present in the array, however, they can be selected to cover a range of lengths. The normalization control(s) can also be selected to reflect the (average) base composition of the other probes in the array, however in a preferred embodiment, only one or a few normalization probes are used and they are selected such that they hybridize well (i.e., no secondary structure) and do not match any target-specific probes.

Normalization probes can be localized at any position in the array or at multiple positions throughout the array to control for spatial variation in hybridization efficiency. In a preferred embodiment, the normalization controls are located at the corners or edges of the array as well as in the middle.

3) Expression level controls.

Expression level controls are probes that hybridize specifically with constitutively expressed genes in the biological sample. Expression level controls are designed to control for the overall health and metabolic activity of a cell. Examination of the covariance of an expression level control with the expression level of the target nucleic acid indicates whether measured changes or variations in expression level of a gene are due to changes in transcription rate of that gene or to general variations in health of the cell. Thus, for example, when a cell is in poor health or lacking a critical metabolite the expression levels of both an active target gene and a constitutively expressed gene are
expected to decrease. The converse is also true. Thus where the expression levels of both an expression level control and the target gene appear to both decrease or to both increase, the change may be attributed to changes in the metabolic activity of the cell as a whole, not to differential expression of the target gene in question. Conversely, where the expression levels of the target gene and the expression level control do not covary, the variation in the expression level of the target gene is attributed to differences in regulation of that gene and not to overall variations in the metabolic activity of the cell.

Virtually any constitutively expressed gene provides a suitable target for expression level controls. Typically expression level control probes have sequences complementary to subsequences of constitutively expressed "housekeeping genes" including, but not limited to the β-actin gene, the transferrin receptor gene, the GAPDH gene, and the like.

4) Mismatch controls.

Mismatch controls may also be provided for the probes to the target genes, for expression level controls or for normalization controls. Mismatch controls are oligonucleotide probes identical to their corresponding test or control probes except for the presence of one or more mismatched bases. A mismatched base is a base selected so that it is not complementary to the corresponding base in the target sequence to which the probe would otherwise specifically hybridize. One or more mismatches are selected such that under appropriate hybridization conditions (e.g., stringent conditions) the test or control probe would be expected to hybridize with its target sequence, but the mismatch probe would not hybridize (or would hybridize to a significantly lesser extent). Preferred mismatch probes contain a central mismatch. Thus, for example, where a probe is a 20 mer, a corresponding mismatch probe will have the identical sequence except for a single base mismatch (e.g., substituting a G, a C or a T for an A) at any of positions 6 through 14 (the central mismatch).

In "generic" (e.g., random, arbitrary, haphazard, etc.) arrays, since the target nucleic acid(s) are unknown perfect match and mismatch probes cannot be a priori determined, designed, or selected. In this instance, the probes are preferably provided as pairs where each pair of probes differ in one or more preselected nucleotides. Thus, while
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SO 



taiget! 



sequence. 



beprovded 



it is not known a priori which of the probes in the pair is the 
that vAvtn one probe specifically hybridizes to a particular 
of the pair will act as a misznalch control for that target 
that the perfect match and mismatch probes need not 
provided as larger ooUectiQiis (eg., 3. 4, 5, or mm] 
in particular preselected nucleotides. 

In both expression monitoring and generic di 
mismatch probes provitfe a control for non-specific binding 
nucleic acid in the sample other than the target to which the 
Mismatch probes dnis indicate whether a hybridization is 
the complementary target is present the perfect match 
brighter than the mismatch probes. In addition, if all central 
mismatch probes can be used to detect a mutation. Finally, i 
present invention dmt the difference in intensity between the 
mismatch probe QCPM^ICMhQ) provides a good 
hybridized material. 
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perfect match, it is known 
sequence, the other probe 
It will be appreciated 
as pairs, but may be 
;) of probek that differ from each other 



difeimces 



cr 



screemng arrays, 
cross-hybridization to a 
ijrobe is complementary. 

or not For example, if 
probes Ishould be consistently 

I nismatches are present, the 
i1 was also a discovery of the 
xrfect match and the 
ooncentratton of the 



I measure of he 
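The central-mismatch construction and the I(PM)-I(MM) difference described above can be illustrated with a minimal sketch. The function names, the lowercase "acgt" alphabet, and the choice of substituting each central base with its own complement are assumptions made for illustration only; the text requires only that the substituted base not pair with the target.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a base that cannot pair with the target base opposite `b`.
 * Swapping a<->t and c<->g is an illustrative choice; any of the three
 * other bases would also give a mismatch. */
static char mismatch_base(char b)
{
    switch (b) {
    case 'a': return 't';
    case 't': return 'a';
    case 'c': return 'g';
    case 'g': return 'c';
    default:  return 0;          /* not a lowercase DNA base */
    }
}

/* Write a central-mismatch copy of the perfect-match probe `pm` into `out`.
 * Returns 0 on success, -1 on bad input or a buffer that is too small. */
int make_central_mismatch(const char *pm, char *out, size_t out_size)
{
    size_t len, mid;
    char sub;

    assert(pm != NULL && out != NULL);
    len = strlen(pm);
    mid = len / 2;               /* index 10 (position 11) for a 20 mer */
    if (len == 0 || len + 1 > out_size)
        return -1;
    sub = mismatch_base(pm[mid]);
    if (sub == 0)
        return -1;
    memcpy(out, pm, len + 1);    /* copy including the terminator */
    out[mid] = sub;
    return 0;
}

/* I(PM) - I(MM), the quantity the text uses as a measure of the
 * concentration of hybridized material. */
double pm_mm_difference(double pm_intensity, double mm_intensity)
{
    return pm_intensity - mm_intensity;
}
```

For a 20 mer the substituted position is 11, which lies within the central positions 6 through 14 named above.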



5) Sample preparation/amplification/quantitation controls.

The high density array may also include sample preparation/amplification control probes. These are probes that are complementary to subsequences of control genes selected because they do not normally occur in the nucleic acids of the particular biological sample being assayed. Suitable sample preparation/amplification control probes include, for example, probes to bacterial genes (e.g., Bio B) where the sample in question is a biological from a eukaryote.

The RNA sample is then spiked with a known amount of the nucleic acid to which the sample preparation/amplification control probe is directed before processing. Quantification of the hybridization of the sample preparation/amplification control probe then provides a measure of alteration in the abundance of the nucleic acids caused by processing steps (e.g. PCR, reverse transcription, in vitro transcription, etc.).

Quantitation controls are similar. Typically they are combined with the sample nucleic acid(s) in known amounts prior to hybridization. They are useful to provide a quantitation reference and permit determination of a standard curve for quantifying hybridization amounts (concentrations).
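A standard curve of the kind mentioned above could be sketched as follows. The text does not say how the curve is fitted; a straight-line least-squares fit of intensity against spiked amount is assumed here purely for illustration, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Straight-line standard curve: intensity = slope * amount + intercept.
 * A linear response is an assumption of this sketch. */
typedef struct {
    double slope;
    double intercept;
} StdCurve;

/* Least-squares fit through n (amount, intensity) pairs from the
 * quantitation controls. Returns 0 on success, -1 if no line can be fitted. */
int fit_std_curve(const double *amount, const double *intensity,
                  size_t n, StdCurve *curve)
{
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0, denom;
    size_t i;

    assert(amount != NULL && intensity != NULL && curve != NULL);
    if (n < 2)
        return -1;
    for (i = 0; i < n; i++) {
        sx  += amount[i];
        sy  += intensity[i];
        sxx += amount[i] * amount[i];
        sxy += amount[i] * intensity[i];
    }
    denom = (double)n * sxx - sx * sx;
    if (denom == 0.0)            /* all amounts identical */
        return -1;
    curve->slope = ((double)n * sxy - sx * sy) / denom;
    curve->intercept = (sy - curve->slope * sx) / (double)n;
    return 0;
}

/* Read an unknown amount off the fitted curve. */
double amount_from_intensity(const StdCurve *curve, double intensity)
{
    assert(curve != NULL && curve->slope != 0.0);
    return (intensity - curve->intercept) / curve->slope;
}
```

With controls spiked at several known amounts, the fitted line converts a measured hybridization intensity back into an estimated amount.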



B) Probe Selection and Optimization.

1) Generic difference screening arrays

a) Assumption-free probe selection.

As explained above, probe oligonucleotide selection for generic difference screening arrays can be random, arbitrary, haphazard, biased, or include all possible oligonucleotides of a particular length. Probe selection is thus essentially assumption free. In some embodiments, however, particular oligonucleotides may be excluded from the array or from analysis. For example, probes that contain palindromic sequences or probes that contain long stretches of all As, Gs, Ts, etc. may be excluded. Probes for exclusion may be identified by hybridizing a single array to the same sample multiple times and/or hybridizing different copies of the array to the same sample. Probes that show an unacceptable variation (variation above a particular threshold value) in hybridization intensity against the same sample may be excluded (either in array construction or in signal analysis). The variation level at which a probe may be excluded is a function of the sensitivity desired of the assay. The more sensitive an assay is desired, the lower the exclusion threshold is set. In a preferred embodiment, the probe is excluded when the variation in hybridization intensity exceeds 2 times the background signal and has a relative variation of more than 50%.

Alternatively such exclusion may be inherent in the selective identification of differentially hybridizing sequences where the difference between a test nucleic acid sample and a reference nucleic acid sample is compared to the difference between the reference nucleic acid sample and itself. This is described more fully below in Section D((B).

b) Exploitation of codon degeneracy.

In another embodiment, species-specific codon usage can be exploited to utilize a longer (and hence more specific and stable) probe without increasing the number of probe oligonucleotides necessary to hybridize to all possible sequences. Amino acid codons are conserved in the first and second position of their codons, while the third position is highly redundant. Moreover each species or organism favors particular codons to encode any particular amino acid. The preferred codon for a particular amino acid in a particular species is the codon that is used at the highest frequency for that species. Codon preferences are well known to those of skill in the art. They can also be readily determined by a simple frequency analysis of the nucleotide sequences of a particular organism or species.

Similarly, the di-, tri-, tetra-nucleotide frequency biases of a particular organism or species can be used to weight the selection of oligonucleotide probes used in a "composition biased" generic difference screening array.

In one preferred embodiment, the probe oligonucleotides are prepared having the first two nucleotides in each codon being fixed but allowing the third nucleotide to vary (either by use of a 4 way wobble or by the use of inosine or other non-specifically hybridizing base). In a preferred embodiment, each codon of the probe will have the general formula

3'-X1-X2-I-5'

where I is inosine or a 4-way wobble and X1 and X2 are A, G, C, or T/U selected according to the preferred codon usage for a particular species. Thus, for example, an array of 16 mers that will hybridize to substantially all nucleic acids of a particular species can be prepared where the probes have the formula:

...-X-X-I-3'

with only 4^10 different probe oligonucleotides. Suitable codons for this probe are illustrated in Table 1.

Table 1. Preferred sequences for generic coding sequence 16 mer probe oligonucleotides.

Codon 5    Codon 4    Codon 3    Codon 2    Codon 1
X  X  I    X  X  I    X  X  I    X  X  I    X  X  I
G  A  I    G  A  I    G  A  I    G  A  I    G  A  I
A  A  I    A  A  I    A  A  I    A  A  I    A  A  I
C  T  I    C  T  I    C  T  I    C  T  I    C  T  I
G  C  I    G  C  I    G  C  I    G  C  I    G  C  I
C  A  I    C  A  I    C  A  I    C  A  I    C  A  I

The affinity of the probes may be further enhanced by the inclusion of additional inosines (or 4-way, 3-way, or 2-way wobble, or other generic bases) to the 3' and 5' ends of the oligonucleotide probes. These codon usage biased probes can be used in conjunction with a ligase discrimination to further increase obtainable sequence information. Thus, for example, where the hybridization to an array comprising the above-described 16 mers also includes a ligation with one or more ligatable oligonucleotides of fixed length N, whose sequence is known, each successful ligation provides 16 + N nucleotides of sequence information.


2) Expression monitoring arrays

In a preferred embodiment, oligonucleotide probes in the expression monitoring high density array are selected to bind specifically to the nucleic acid target to which they are directed with minimal non-specific binding or cross-hybridization under the particular hybridization conditions utilized. Because the high density arrays of this invention can contain in excess of 1,000,000 different probes, it is possible to provide every probe of a characteristic length that binds to a particular nucleic acid sequence. Thus, for example, the high density array can contain every possible 20 mer sequence complementary to an IL-2 mRNA.

There may exist, however, 20 mer subsequences that are not unique to the IL-2 mRNA. Probes directed to these subsequences are expected to cross hybridize with occurrences of their complementary sequence in other regions of the sample genome. Similarly, other probes simply may not hybridize effectively under the hybridization conditions (e.g., due to secondary structure, or interactions with the substrate or other probes). Thus, in a preferred embodiment, the probes that show such poor specificity or hybridization efficiency are identified and may not be included either in the high density array itself (e.g., during fabrication of the array) or in the post-hybridization data analysis.

In addition, in a preferred embodiment, expression monitoring arrays are used to identify the presence and expression (transcription) level of genes which are several hundred base pairs long or longer. For most applications it would be useful to identify the presence, absence, or expression level of several thousand to one hundred thousand genes. Because the number of oligonucleotides per array is limited, in a preferred embodiment, it is desired to include only a limited set of probes specific to each gene whose expression is to be detected.



a) Hybridization and cross-hybridization.

Thus, in one embodiment, this invention provides for a method of optimizing a probe set for detection of a particular gene. Generally, this method involves providing a high density array containing a multiplicity of probes of one or more particular length(s) that are complementary to subsequences of the mRNA transcribed by the target gene. In one embodiment the high density array may contain every probe of a particular length that is complementary to a particular mRNA. The probes of the high density array are then hybridized with their target nucleic acid alone and then hybridized with a high complexity, high concentration nucleic acid sample that does not contain the targets complementary to the probes. Thus, for example, where the target nucleic acid is an RNA, the probes are first hybridized with their target nucleic acid alone and then hybridized with RNA made from a cDNA library (e.g., reverse transcribed mRNA) where the sense of the hybridized RNA is opposite that of the target nucleic acid (to insure that the high complexity sample does not contain targets for the probes). Those probes that show a strong hybridization signal with their target and little or no cross-hybridization with the high complexity sample are preferred probes for use in the high density arrays of this invention.

The high density array may additionally contain mismatch controls for each of the probes to be tested. In a preferred embodiment, the mismatch controls contain a central mismatch. Where both the mismatch control and the target probe show high levels of hybridization (e.g., the hybridization to the mismatch is nearly equal to or greater than the hybridization to the corresponding test probe), the test probe is preferably not used in the high density array.

In a particularly preferred embodiment, optimal probes are selected according to the following method: First, as indicated above, an array is provided containing a multiplicity of oligonucleotide probes complementary to subsequences of the target nucleic acid. The oligonucleotide probes may be of a single length or may span a variety of lengths. The high density array may contain every probe of a particular length that is complementary to a particular mRNA or may contain probes selected from various regions of particular mRNAs. For each target-specific probe the array also contains a mismatch control probe, preferably a central mismatch control probe.

The oligonucleotide array is hybridized to a sample containing target nucleic acids having subsequences complementary to the oligonucleotide probes and the difference in hybridization intensity between each probe and its mismatch control is determined. Only those probes where the difference between the probe and its mismatch control exceeds a threshold hybridization intensity (e.g., preferably greater than 10% of the background signal intensity, more preferably greater than 20% of the background signal intensity and most preferably greater than 50% of the background signal intensity) are selected. Thus, only probes that show a strong signal compared to their mismatch control are selected.

The probe optimization procedure can optionally include a second round of selection. In this selection, the oligonucleotide probe array is hybridized with a nucleic acid sample that is not expected to contain sequences complementary to the probes. Thus, for example, where the probes are complementary to the RNA sense strand a sample of antisense RNA is provided. Of course, other samples could be provided such as samples from organisms or cell lines known to be lacking a particular gene, or known for not expressing a particular gene.

Only those probes where both the probe and its mismatch control show hybridization intensities below a threshold value (e.g., less than about 5 times the background signal intensity, with progressively lower multiples of the background signal intensity being preferred, and most preferably equal to or less than about half the background signal intensity) are selected. In this way probes that show minimal non-specific binding are selected. Finally, in a preferred embodiment, the n probes (where n is the number of probes desired for each target gene) that pass both selection criteria and have the highest hybridization intensity for each target gene are selected for incorporation into the array, or where already present in the array, for subsequent data analysis. Of course, one of skill in the art will appreciate that either selection criterion could be used alone for selection of probes.
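The two selection rounds and the final choice of the n brightest surviving probes can be sketched as below. The struct layout, function names, and the way thresholds are passed (as fractions or multiples of background) are assumptions made for illustration; the threshold values themselves would be taken from the ranges given in the text.

```c
#include <assert.h>
#include <stddef.h>

/* Intensities measured for one probe and its mismatch control. */
typedef struct {
    double pm_target;   /* probe, sample containing the target      */
    double mm_target;   /* mismatch control, same sample            */
    double pm_blank;    /* probe, sample lacking the target         */
    double mm_blank;    /* mismatch control, sample lacking target  */
} ProbeSignals;

/* Round 1: PM - MM must exceed min_diff_fraction * background.
 * Round 2: both PM and MM on the target-free sample must stay below
 * max_blank_multiple * background. Returns 1 if both rounds pass. */
int passes_both_rounds(const ProbeSignals *p, double background,
                       double min_diff_fraction, double max_blank_multiple)
{
    assert(p != NULL);
    if (p->pm_target - p->mm_target <= min_diff_fraction * background)
        return 0;
    if (p->pm_blank >= max_blank_multiple * background ||
        p->mm_blank >= max_blank_multiple * background)
        return 0;
    return 1;
}

/* Pick up to n passing probes with the highest target intensity.
 * Indices of the chosen probes are written to `chosen` (brightest first);
 * the number actually chosen is returned. */
size_t select_top_probes(const ProbeSignals *probes, size_t count,
                         double background, double min_diff_fraction,
                         double max_blank_multiple, size_t n, size_t *chosen)
{
    size_t picked = 0;

    while (picked < n) {
        size_t best = count;                 /* sentinel: none found yet */
        size_t i, j;
        for (i = 0; i < count; i++) {
            int taken = 0;
            for (j = 0; j < picked; j++)
                if (chosen[j] == i) taken = 1;
            if (taken || !passes_both_rounds(&probes[i], background,
                                             min_diff_fraction,
                                             max_blank_multiple))
                continue;
            if (best == count ||
                probes[i].pm_target > probes[best].pm_target)
                best = i;
        }
        if (best == count)
            break;                           /* no more passing probes */
        chosen[picked++] = best;
    }
    return picked;
}
```

Either round could be dropped by passing a permissive threshold, matching the remark that either criterion may be used alone.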



b) Heuristic rules.

Using the hybridization and cross-hybridization data obtained as described above, graphs can be made of hybridization and cross-hybridization intensities versus various probe properties, e.g., number of As, number of Cs in a window of 8 bases, palindromic strength, etc. The graphs can then be examined for correlations between those properties and the hybridization or cross-hybridization intensities. Thresholds can be set beyond which it looks like hybridization is always poor or cross hybridization is always very strong. If any probe fails one of the criteria, it is rejected from the set of probes and therefore not placed on the chip. This will be called the heuristic rules method.

One set of rules developed for 20 mer probes in this manner is the following:

Hybridization rules:

1) Number of As is less than 9.

2) Number of Ts is less than 10 and greater than 0.

3) Maximum run of As, Gs, or Ts is less than 4 bases in a row.

4) Maximum run of any 2 bases is less than 11 bases.

5) Palindrome score is less than 6.

6) Clumping score is less than 6.

7) Number of As + Number of Ts is less than 14.

8) Number of As + Number of Gs is less than 15.

With respect to rule number 4, requiring the maximum run of any two bases to be less than 11 bases guarantees that at least three different bases occur within any 12 consecutive nucleotides. A palindrome score is the maximum number of complementary bases if the oligonucleotide is folded over at a point that maximizes self complementarity. Thus, for example, a 20 mer that is perfectly self-complementary would have a palindrome score of 10. A clumping score is the maximum number of three-mers of identical bases in a given sequence. Thus, for example, a run of 5 identical bases will produce a clumping score of 3 (bases 1-3, bases 2-4, and bases 3-5).

If any probe failed one of these criteria (1-8), the probe was not a member of the subset of probes placed on the chip. For example, if a hypothetical probe was 5'-AGCTTTTTTCATGCATCTAT-3' the probe would not be synthesized on the chip because it has a run of four or more bases (i.e., run of six).

The cross hybridization rules developed for 20 mers were as follows:

1) Number of Cs is less than 8;

2) Number of Cs in any window of 8 bases is less than 4.

Thus, if any probe failed any of either the hybridization rules (1-8) or the cross-hybridization rules (1-2), the probe was not a member of the subset of probes placed on the chip. These rules eliminated many of the probes that cross hybridized strongly or exhibited low hybridization, and performed a moderate job of eliminating weakly hybridizing probes.

These heuristic rules may be implemented by hand calculations, or alternatively, they may be implemented in software as is discussed below in Section XII.

c) Neural net

In another embodiment, a neural net can be trained to predict the hybridization and cross-hybridization intensities based on the sequence of the probe or on other probe properties. The neural net can then be used to pick an arbitrary number of the "best" probes. One such neural net was developed for selecting 20-mer probes. This neural net produced a moderate (0.7) correlation between predicted intensity and measured intensity, with a better model for cross hybridization than hybridization. Details of this neural net are provided in Example 6.

d) ANOVA Model

An analysis of variance (ANOVA) model can be built to model the intensities based on positions of consecutive base pairs. This is based on the theory that the melting energy is based on stacking energies of consecutive bases. The ANOVA model was used to find correlation between a probe sequence and the hybridization and cross-hybridization intensities. The inputs were probe sequences broken down into consecutive base pairs. One model was made to predict hybridization, another was made to predict cross hybridization. The output was the hybridization or cross-hybridization intensity.

There were 304 (19 * 16) possible inputs, consisting of the 16 possible two base combinations, and the 19 positions that those combinations could be found in. For example, the sequence aggctga... has "ag" in the first position, "gg" in the second position, "gc" in the third, "ct" in the fourth and so on.

The resulting model assigned a component of the output intensity to each of the possible inputs, so to estimate the intensity for a given sequence one simply adds the intensities for each of its 19 components.

e) Pruning (removal) similar probes

One of the causes of poor signals in expression chips is that genes other than the ones being monitored have sequences which are similar to parts of the sequences which are being monitored. The easiest way to solve this is to remove probes which are similar to more than one gene. Thus, in a preferred embodiment, it is desirable to remove (prune) probes that hybridize to transcription products of more than one gene.

The simplest pruning method is to line up a proposed probe with all known genes for the organism being monitored, then count the number of matching bases. For example, given a probe to gene 1 of an organism and gene 2 of an organism as follows:

probe from gene 1: aagcgcgatcgattatgctc
                   |     |||||||
gene 2:            atctcggatcgatcggataagcgcgatcgattatgctcggcga

the probe has 8 matching bases in this alignment, but 20 matching bases in the following alignment:

probe from gene 1:                   aagcgcgatcgattatgctc
                                     ||||||||||||||||||||
gene 2:            atctcggatcgatcggataagcgcgatcgattatgctcggcga

More complicated algorithms also exist, which allow the detection of insertion or deletion mismatches. Such sequence alignment algorithms are well known to those of skill in the art and include, but are not limited to BLAST, or FASTA, or other gene matching programs such as those described above in the definitions section.

In another variant, where an organism has many different genes which are very similar, it is difficult to make a probe set that measures the concentration of only one of those very similar genes. One can then prune out any probes which are dissimilar, and make the probe set a probe set for that family of genes.
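The simplest pruning method described above (sliding the probe along a gene and counting matching bases at each ungapped alignment) can be sketched as follows. The function names are hypothetical, and insertions or deletions are deliberately not handled, since the text assigns those to programs such as BLAST or FASTA.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count identical bases when the probe is laid against the gene starting
 * at `offset` (ungapped alignment). */
size_t match_count_at(const char *probe, const char *gene, size_t offset)
{
    size_t plen = strlen(probe), glen = strlen(gene), i, matches = 0;

    assert(offset + plen <= glen);
    for (i = 0; i < plen; i++)
        if (probe[i] == gene[offset + i])
            matches++;
    return matches;
}

/* Try every ungapped alignment and return the highest match count.
 * If best_offset is not NULL it receives the offset of that alignment. */
size_t best_match_count(const char *probe, const char *gene,
                        size_t *best_offset)
{
    size_t plen = strlen(probe), glen = strlen(gene), off, best = 0;

    if (best_offset != NULL)
        *best_offset = 0;
    if (plen == 0 || plen > glen)
        return 0;
    for (off = 0; off + plen <= glen; off++) {
        size_t m = match_count_at(probe, gene, off);
        if (m > best) {
            best = m;
            if (best_offset != NULL)
                *best_offset = off;
        }
    }
    return best;
}
```

Applied to the example sequences in the text, the first alignment (offset 0) gives 8 matches and the best alignment gives all 20, so a probe would be pruned if its best match count against any gene other than its own exceeded a chosen cutoff.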



f) Synthesis cycle pruning.

The cost of producing masks for a chip is approximately linearly related to the number of synthesis cycles. In a normal set of genes the distribution of the number of cycles any probe takes to build approximates a Gaussian distribution. Because of this the mask cost can normally be reduced by 15% by throwing out about 3 percent of the probes. In a preferred embodiment, synthesis cycle pruning simply involves eliminating (not including) those probes that require a greater number of synthesis cycles than the maximum number of synthesis cycles selected for preparation of the particular subject high density oligonucleotide array. Since the typical synthesis of probes follows a regular pattern of bases put down (acgtacgtacgt...) counting the number of synthesis steps needed to build a probe is easy. The listing shown in Table 1 provides typical code for counting the number of synthesis cycles a probe will need.

Table 1. Typical code for counting synthesis cycles required for the chemical synthesis of a probe.

#include <ctype.h>    /* isupper, isalpha, tolower */
#include <string.h>   /* strchr */

/* errorHwnd() and the `local` pointer qualifier belong to the original
   application environment and are defined elsewhere. */

static char base[] = "acgt";
//                       a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z
static short index[] = { 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0 };

/* Map a lowercase base to its slot in the a, c, g, t synthesis order. */
short lookupIndex( char aBase ){
    if( isupper( aBase ) || !isalpha( aBase ) ){
        errorHwnd( "illegal base" );
        return -1;
    }
    if( strchr( base, aBase ) == NULL ){
        errorHwnd( "non-dna base" );
        return 0;
    }
    return index[ aBase - 'a' ];
}

/* Count the synthesis steps needed to build the complement of `buffer`. */
static short calculateMinNumberOfSynthesisStepsForComplement( char local * buffer ){
    short i, last, current, cycles = 1;
    char buffer1[40];
    /* Build the base-by-base complement of the input. */
    for( i = 0; buffer[i] != 0; i++ ){
        switch( tolower(buffer[i]) ){
            case 'a': buffer1[i] = 't'; break;
            case 'c': buffer1[i] = 'g'; break;
            case 'g': buffer1[i] = 'c'; break;
            case 't': buffer1[i] = 'a'; break;
            default:  buffer1[i] = buffer[i]; break;  /* flagged by lookupIndex */
        }
    }
    buffer1[i] = 0;
    if( buffer1[0] == 0 ) return 0;
    last = current = lookupIndex( buffer1[0] );
    /* A new acgt cycle starts whenever the next base does not come later
       in the cycle than the previous one. */
    for( i = 1; buffer1[i] != 0; i++ ){
        current = lookupIndex( buffer1[i] );
        if( current <= last ) cycles++;
        last = current;
    }
    return (short)((cycles - 1) * 4 + current + 1);
}
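The pruning step itself (dropping probes that need more synthesis steps than a chosen maximum) is not shown in the listing. A minimal, portable sketch follows; the names synthesis_steps and prune_by_steps are hypothetical, and the sketch assumes the sequence is given in lowercase in the order in which bases are laid down, with the repeating a, c, g, t deposition order described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of single-base synthesis steps needed to build `seq` when bases
 * are offered in the repeating order a, c, g, t. Returns -1 if `seq`
 * contains anything other than lowercase a, c, g, t. */
int synthesis_steps(const char *seq)
{
    static const char order[] = "acgt";
    int step = 0;                        /* 1-based index of last step used */
    size_t i;

    assert(seq != NULL);
    for (i = 0; seq[i] != '\0'; i++) {
        const char *p = strchr(order, seq[i]);
        int slot;
        if (p == NULL)
            return -1;
        slot = (int)(p - order);         /* 0..3 within one cycle */
        /* Advance to the next step at which this base is offered. */
        do {
            step++;
        } while ((step - 1) % 4 != slot);
    }
    return step;
}

/* Keep only probes that can be built within max_steps. Pointers to the
 * kept probes are written to `kept`; the number kept is returned. */
size_t prune_by_steps(const char *const *probes, size_t n, int max_steps,
                      const char **kept)
{
    size_t i, m = 0;

    for (i = 0; i < n; i++) {
        int s = synthesis_steps(probes[i]);
        if (s >= 0 && s <= max_steps)
            kept[m++] = probes[i];
    }
    return m;
}
```

For instance, "acgt" needs 4 steps while "tgca" needs 13, so a 12-step limit would retain the first and discard the second.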



g) Combination of selection methods.

The heuristic rules, neural net and ANOVA model provide ways of pruning or reducing the number of probes for monitoring the expression of genes. As these methods do not necessarily produce the same results, or produce entirely independent results, it may be advantageous to combine the methods. For example, probes may be pruned or reduced if more than one method (e.g., two out of three) indicate the probe will not likely produce good results. Then, synthesis cycle pruning may be performed to reduce costs.

Fig. 11 shows the flow of a process of increasing the number of probes for monitoring the expression of genes after the number of probes has been reduced or pruned. In one embodiment, a user is able to specify the number of nucleic acid probes that should be placed on the chip to monitor the expression of each gene. As discussed above, it is advantageous to reduce probes that will not likely produce good results; however, the number of probes may be reduced to substantially less than the desired number of probes.

At step 402, the number of probes for monitoring multiple genes is reduced by the heuristic rules method, neural net, ANOVA model, synthesis cycle pruning, or any other method, or combination of methods. A gene is selected at step 404.

A determination is made whether the remaining probes for monitoring the selected gene number greater than 80% (which may be varied or user defined) of the desired number of probes. If yes, the computer system proceeds to the next gene at step 408 which will generally return to step 404.

If the remaining probes for monitoring the selected gene do not number greater than 80% of the desired number of probes, a determination is made whether the remaining probes for monitoring the selected gene number greater than 40% (which may be varied or user defined) of the desired number of probes. If yes, an "i" is appended to the end of the gene name to indicate that after pruning, the probes were incomplete at step 412.

At step 414, the number of probes is increased by loosening the constraints that rejected probes. For example, the thresholds in the heuristic rules may be increased by 1. Therefore, if previously probes were rejected if they had four As in a row, the rule may be loosened to five As in a row.

A determination is then made whether the remaining probes for monitoring the selected gene number greater than 80% of the desired number of probes at step 416. If yes, an "r" is appended to the end of the gene name at step 412 to indicate that the rules were loosened to generate the number of synthesized probes for that gene.

At step 420, a check is made to see if the probes for monitoring the selected gene only conflict with one or two other genes. If yes, the full set of probes complementary to the gene (or target sequence) are taken and pruned so that the probes remaining are exactly complementary to the selected gene exclusively at step 422.

A determination is then made whether the remaining probes for monitoring the selected gene number greater than 80% of the desired number of probes at step 424. If yes, an "s" is appended to the end of the gene name at step 426 to indicate that only a few genes were similar to the selected gene.

At step 428, the probes for monitoring the selected gene are not reduced by conflicts at all. A determination is then made whether the remaining probes for monitoring the selected gene number greater than 80% of the desired number of probes at step 430. If yes, an "f" is appended to the end of the gene name to indicate that the probes include the whole family of probes perfectly complementary to the gene.

If there are still not 80% of the desired number of probes, an error is reported at step 434. Any number of error handling procedures may be undertaken. For example, an error message may be generated for the user and the probes for the gene may not be stored. Alternatively, the user may be prompted to enter a new desired number of probes.
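Because step 414 loosens the heuristic thresholds by a fixed amount, it is convenient to express the 20 mer heuristic rules as a checker that takes a loosening parameter. The sketch below is one possible reading, not the implementation used by the inventors: all names are hypothetical, sequences are assumed lowercase, the lower bound on Ts is taken as "greater than 0", the palindrome score is computed only for folds between adjacent bases, and `slack` is added uniformly to every upper threshold.

```c
#include <assert.h>
#include <string.h>

int count_base(const char *s, char b)
{
    int n = 0;
    for (; *s; s++) if (*s == b) n++;
    return n;
}

int is_pair(char x, char y)
{
    return (x == 'a' && y == 't') || (x == 't' && y == 'a') ||
           (x == 'c' && y == 'g') || (x == 'g' && y == 'c');
}

/* Longest run of one repeated base, counting only bases in `bases`. */
int max_single_run(const char *s, const char *bases)
{
    int best = 0, run = 0;
    for (size_t i = 0; s[i]; i++) {
        run = (i > 0 && s[i] == s[i - 1]) ? run + 1 : 1;
        if (strchr(bases, s[i]) != NULL && run > best) best = run;
    }
    return best;
}

/* Longest stretch made up of at most two distinct bases. */
int max_two_base_run(const char *s)
{
    int best = 0;
    for (size_t i = 0; s[i]; i++) {
        char b1 = s[i], b2 = 0;
        int len = 0;
        for (size_t j = i; s[j]; j++) {
            if (s[j] != b1) {
                if (b2 == 0) b2 = s[j];
                else if (s[j] != b2) break;
            }
            len++;
        }
        if (len > best) best = len;
    }
    return best;
}

/* Most complementary pairs over all folds between adjacent bases. */
int palindrome_score(const char *s)
{
    int n = (int)strlen(s), best = 0;
    for (int f = 0; f + 1 < n; f++) {
        int score = 0;
        for (int k = 0; f - k >= 0 && f + 1 + k < n; k++)
            if (is_pair(s[f - k], s[f + 1 + k])) score++;
        if (score > best) best = score;
    }
    return best;
}

/* Number of positions that start three identical bases in a row. */
int clumping_score(const char *s)
{
    int score = 0;
    for (size_t i = 0; s[i] && s[i + 1] && s[i + 2]; i++)
        if (s[i] == s[i + 1] && s[i] == s[i + 2]) score++;
    return score;
}

int max_c_in_window(const char *s, int width)
{
    int n = (int)strlen(s), best = 0;
    for (int i = 0; i < n; i++) {
        int c = 0;
        for (int j = i; j < n && j < i + width; j++)
            if (s[j] == 'c') c++;
        if (c > best) best = c;
    }
    return best;
}

/* Hybridization rules 1-8 and cross-hybridization rules 1-2 for a 20 mer.
 * slack = 0 gives the published thresholds; slack = 1 loosens each by 1. */
int passes_heuristic_rules(const char *s, int slack)
{
    int a = count_base(s, 'a'), c = count_base(s, 'c');
    int g = count_base(s, 'g'), t = count_base(s, 't');
    assert(slack >= 0);
    return a < 9 + slack && t < 10 + slack && t > 0 &&
           max_single_run(s, "agt") < 4 + slack &&
           max_two_base_run(s) < 11 + slack &&
           palindrome_score(s) < 6 + slack &&
           clumping_score(s) < 6 + slack &&
           a + t < 14 + slack && a + g < 15 + slack &&
           c < 8 + slack && max_c_in_window(s, 8) < 4 + slack;
}
```

A probe with four As in a row is rejected at slack 0 and accepted at slack 1 if it meets the other rules, which mirrors the loosening example given for step 414.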



lchemi(al 



K SyrUhesisof High Datsiiy Arrays 

Methods of forming high density arrays of ol 
other polymer sequences with a minimal number of synthet 
oligonucleotide analogue array can be synthesized on a soli< 1 
methods, including, but not limited to, light-directed 
directed coiq)ling. See Pirrung et oL^ U.S. Patent No. S, 
No. WO 90/1 5070) and Fodor et al., PCT Publication Nos. 
oligonucleotides, peptides and
c steps are known. The
substrate by a variety of
coupling, and mechanically
143,854 (see also PCT Application
WO 92/10092 and WO 93/09668 which disclose methods of forming vast arrays
of peptides, oligonucleotides and other molecules using, for example,
light-directed synthesis techniques. See also, Fodor et al., Science, 251,
767-77 (1991). These procedures for synthesis of polymer arrays are now
referred to as VLSIPS™ procedures. Using the VLSIPS™ approach, one
heterogenous array of polymers is converted, through simultaneous coupling
at a number of reaction sites, into a different heterogenous array. See,
U.S. Application Serial Nos. 07/796,243 and 07/980,523.

The development of VLSIPS™ technology as described in the above-noted
U.S. Patent No. 5,143,854 and PCT patent publication Nos. WO 90/15070 and
92/10092, is considered pioneering technology in the fields of
combinatorial synthesis and screening of combinatorial libraries. More
recently, patent application Serial No. 08/082,937, filed June 25, 1993
describes methods for making arrays of oligonucleotide probes that can be
used to check or determine a partial or complete sequence of a target
nucleic acid and to detect the presence of a nucleic acid containing a
specific oligonucleotide sequence.

In brief, the light-directed combinatorial synthesis of oligonucleotide
arrays on a glass surface proceeds using automated phosphoramidite
chemistry and chip masking techniques. In one specific implementation, a
glass surface is derivatized with a silane reagent containing a functional
group, e.g., a hydroxyl or amine group blocked by a photolabile protecting
group. Photolysis through a photolithographic mask is used selectively to
expose functional groups which are then ready to react with incoming
5'-photoprotected nucleoside phosphoramidites. The phosphoramidites react
only with those sites which are illuminated (and thus exposed by removal of
the photolabile blocking group). Thus, the phosphoramidites only add to
those areas selectively exposed from the preceding step. These steps are
repeated until the desired array of sequences have been synthesized on the
solid surface. Combinatorial synthesis of different oligonucleotide
analogues at different locations on the array is determined by the pattern
of illumination during synthesis and the order of addition of coupling
reagents.

In the event that an oligonucleotide analogue with a polyamide backbone is
used in the VLSIPS™ procedure, it is generally inappropriate to use
phosphoramidite chemistry to perform the synthetic steps, since the
monomers do not attach to one another via a phosphate linkage. Instead,
peptide synthetic methods are substituted. See, e.g., Pirrung et al. U.S.
Pat. No. 5,143,854.

Peptide nucleic acids are commercially available from, e.g., Biosearch,
Inc. (Bedford, MA) which comprise a polyamide backbone and the bases found
in naturally occurring nucleosides. Peptide nucleic acids are capable of
binding to nucleic acids with high specificity, and are considered
"oligonucleotide analogues" for purposes of this disclosure.
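
The light-directed scheme described above, in which the pattern of illumination and the order of monomer addition together determine the sequence at each site, can be illustrated with a minimal sketch. The function name, the site numbering, and the four-site layout below are hypothetical and serve only as an illustration; the sketch records bases in the order they are coupled and ignores coupling efficiency and protecting-group chemistry.

```python
from typing import Iterable, List, Set, Tuple


def light_directed_synthesis(
    n_sites: int, cycles: Iterable[Tuple[Set[int], str]]
) -> List[str]:
    """Simulate mask-directed synthesis on a linear array of sites.

    Each cycle is (illuminated_sites, monomer). Illumination through the
    mask deprotects only the listed sites, so the incoming monomer couples
    only there. Strings list monomers in order of addition.
    """
    probes = [""] * n_sites
    for illuminated, monomer in cycles:
        for site in illuminated:
            if not 0 <= site < n_sites:
                raise ValueError(f"site {site} is outside the array")
            probes[site] += monomer
    return probes


# Four sites, four mask/monomer cycles (illustrative values only).
cycles = [
    ({0, 1}, "A"),  # mask 1 exposes sites 0 and 1; couple A
    ({2, 3}, "C"),  # mask 2 exposes sites 2 and 3; couple C
    ({0, 2}, "G"),  # mask 3 exposes sites 0 and 2; couple G
    ({1, 3}, "T"),  # mask 4 exposes sites 1 and 3; couple T
]
probes = light_directed_synthesis(4, cycles)
print(probes)  # ['AG', 'AT', 'CG', 'CT']
```

Four coupling cycles yield four distinct dimers here because each mask exposes a different subset of sites.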
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sre< 



> 93/09)68. 



In addition to the foregoing, additional methods which can be used to
generate an array of oligonucleotides on a single substrate are described
in co-pending Applications Ser. No. 07/980,523, filed November 20, 1992,
and 07/796,243, filed November 22, 1991 and in PCT Publication No. WO
93/09668. In the methods disclosed in these applications, reagents are
delivered to the substrate by either (1) flowing within a channel defined
on predefined regions or (2) "spotting" on predefined regions. However,
other approaches, as well as combinations of spotting and flowing, may be
employed. In each instance, certain activated regions of the substrate are
mechanically separated from other regions when the monomer solutions are
delivered to the various reaction sites.

A typical "flow channel" method applied to the compounds and libraries of
the present invention can generally be described as follows. Diverse
polymer sequences are synthesized at selected regions of a substrate or
solid support by forming flow channels on a surface of the substrate
through which appropriate reagents flow or in which appropriate reagents
are placed. For example, assume a monomer "A" is to be bound to the
substrate in a first group of selected regions. If necessary, all or part
of the surface of the substrate in all or a part of the selected regions is
activated for binding by, for example, flowing appropriate reagents through
all or some of the channels, or by washing the entire substrate with
appropriate reagents. After placement of a channel block on the surface of
the substrate, a reagent having the monomer A flows through or is placed in
all or some of the channel(s). The channels provide fluid contact to the
first selected regions, thereby binding the monomer A on the substrate
directly or indirectly (via a spacer) in the first selected regions.

Thereafter, a monomer B is coupled to second selected regions, some of
which may be included among the first selected regions. The second selected
regions will be in fluid contact with a second flow channel(s) through
translation, rotation, or replacement of the channel block on the surface
of the substrate; through opening or closing a selected valve; or through
deposition of a layer of chemical or photoresist. If necessary, a step is
performed for activating at least the second regions. Thereafter, the
monomer B is flowed through or placed in the second flow channel(s),
binding monomer B at the second selected locations. In this particular
example, the resulting sequences bound to the substrate at this stage of
processing will be, for example, A, B, and AB. The process
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riany ( 



is repeated to form a vast array of sequences of desired length at known
locations on the substrate.

After the substrate is activated, monomer A can be flowed through some of
the channels, monomer B can be flowed through other channels, a monomer C
can be flowed through still other channels, etc. In this manner, many or
all of the reaction regions are reacted with a monomer before the channel
block must be moved or the substrate must be washed and/or reactivated. By
making use of many or all of the available reaction regions simultaneously,
the number of washing and activation steps can be minimized.

One of skill in the art will recognize that there are alternative methods
of forming channels or otherwise protecting a portion of the surface of the
substrate. For example, according to some embodiments, a protective coating
such as a hydrophilic or hydrophobic coating (depending upon the nature of
the solvent) is utilized over portions of the substrate to be protected,
sometimes in combination with materials that facilitate wetting by the
reactant solution in other regions. In this manner, the flowing solutions
are further prevented from passing outside of their designated flow paths.

According to other embodiments the channels will be formed by depositing an
electron or photoresist such as those used extensively in the semiconductor
industry. Such materials include polymethyl methacrylate (PMMA) and its
derivatives, and electron beam resists such as poly(olefin sulfones) and
the like (more fully described in Chapter 10 of Ghandi, VLSI Fabrication
Principles, Wiley (1983)). According to these embodiments, a resist is
deposited, selectively exposed, and etched, leaving a portion of the
substrate exposed for coupling. These steps of depositing resist,
selectively removing resist and monomer coupling are repeated to form
polymers of desired sequence at desired locations.

The "spotting" methods of preparing compounds and libraries of the present
invention can be implemented in much the same manner as the flow channel
methods. For example, a monomer A, or a coupled dimer, trimer, tetramer,
etc., or a fully synthesized material, can be delivered to and coupled with
a first group of reaction regions which have been appropriately activated.
Thereafter, a monomer B can be delivered to and reacted with a second group
of activated reaction regions. Unlike the flow channel embodiments
described above, reactants are delivered by directly depositing (rather
than flowing) relatively small quantities of them in selected regions. In
some steps, of course, the entire substrate surface can be sprayed or
otherwise coated with a solution. In preferred embodiments, a dispenser
moves from region to region, depositing only as much monomer as necessary
at each stop. Typical dispensers include a micropipette to deliver the
monomer solution to the substrate and a robotic system to control the
position of the micropipette with respect to the substrate. In other
embodiments, the dispenser includes a series of tubes, a manifold, an array
of pipettes, or the like so that various reagents can be delivered to the
reaction regions simultaneously.
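
The flow channel example above, in which monomer A is bound in a first set of regions and monomer B in a second, overlapping set to give the sequences A, B, and AB, can be sketched as follows. The grid model is an assumption made only for illustration: channels are taken to run along rows, and rotating the channel block by 90 degrees is taken to make them run along columns. The function and parameter names are hypothetical.

```python
from typing import List, Sequence, Tuple


def flow_channel_synthesis(
    n_rows: int, n_cols: int, steps: Sequence[Tuple[str, int, str]]
) -> List[List[str]]:
    """Couple monomers to every cell wetted by a channel.

    Each step is (orientation, index, monomer). A "row" channel contacts
    every cell in that row; a "col" channel (channel block rotated by 90
    degrees) contacts every cell in that column.
    """
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for orientation, index, monomer in steps:
        if orientation == "row":
            cells = [(index, c) for c in range(n_cols)]
        elif orientation == "col":
            cells = [(r, index) for r in range(n_rows)]
        else:
            raise ValueError(f"unknown orientation: {orientation!r}")
        for r, c in cells:
            grid[r][c] += monomer
    return grid


# Flow A through a channel over row 0, rotate the block, then flow B
# through a channel over column 0.
grid = flow_channel_synthesis(2, 2, [("row", 0, "A"), ("col", 0, "B")])
print(grid)  # [['AB', 'A'], ['B', '']]
```

The cell where the two channels cross carries AB, the remaining wetted cells carry A or B alone, and the untouched cell stays empty.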



VI. Hybridization.

Nucleic acid hybridization simply involves providing a denatured probe and
target nucleic acid under conditions where the probe and its complementary
target can form stable hybrid duplexes through complementary base pairing.
The nucleic acids that do not form hybrid duplexes are then washed away
leaving the hybridized nucleic acids to be detected, typically through
detection of an attached detectable label. It is generally recognized that
nucleic acids are denatured by increasing the temperature or decreasing the
salt concentration of the buffer containing the nucleic acids, by the
addition of chemical agents, or by raising the pH. Under low stringency
conditions (e.g., low temperature and/or high salt and/or high target
concentration) hybrid duplexes (e.g., DNA:DNA, RNA:RNA, or RNA:DNA) will
form even where the annealed sequences are not perfectly complementary.
Thus specificity of hybridization is reduced at lower stringency.
Conversely, at higher stringency (e.g., higher temperature or lower salt)
successful hybridization requires fewer mismatches.

One of skill in the art will appreciate that hybridization conditions may
be selected to provide any degree of stringency. In a preferred embodiment,
hybridization is performed at low stringency, in this case in 6X SSPE-T at
about 40°C to about 50°C (0.005% Triton X-100), to ensure hybridization and
then subsequent washes are performed at higher stringency (e.g., 1 X SSPE-T
at 37°C) to eliminate mismatched hybrid duplexes. Successive washes may be
performed at increasingly higher stringency (e.g., down to as low as 0.25 X
SSPE-T at 37°C to 50°C) until a desired level of hybridization specificity
is obtained. Stringency can also be increased by addition of agents such as
formamide. Hybridization specificity may be evaluated by comparison of
hybridization to the test probes with hybridization to the various controls
that can be present (e.g., expression level control, normalization control,
mismatch controls, etc.).

In general, there is a tradeoff between hybridization specificity
(stringency) and signal intensity. Thus, in a preferred embodiment, the
wash is performed at the highest stringency that produces consistent
results and that provides a signal intensity greater than approximately 10%
of the background intensity. Thus, in a preferred embodiment, the
hybridized array may be washed at successively higher stringency solutions
and read between each wash. Analysis of the data sets thus produced will
reveal a wash stringency above which the hybridization pattern is not
appreciably altered and which provides adequate signal for the particular
oligonucleotide probes of interest.

In a preferred embodiment, background signal is reduced by the use of a
detergent (e.g., C-TAB) or a blocking reagent (e.g., sperm DNA, cot-1 DNA,
etc.) during the hybridization to reduce non-specific binding. In a
particularly preferred embodiment, the hybridization is performed in the
presence of about 0.1 to about 0.5 mg/ml DNA (e.g., herring sperm DNA). The
use of blocking agents in hybridization is well known to those of skill in
the art (see, e.g., Chapter 8 in P. Tijssen, supra.)

The stability of duplexes formed between RNAs or DNAs is generally in the
order of RNA:RNA > RNA:DNA > DNA:DNA, in solution. Long probes have better
duplex stability with a target, but poorer mismatch discrimination than
shorter probes (mismatch discrimination refers to the measured
hybridization signal ratio between a perfect match probe and a single base
mismatch probe). Shorter probes (e.g., 8-mers) discriminate mismatches very
well, but the overall duplex stability is low.

Altering the thermal stability (Tm) of the duplex formed between the target
and the probe using, e.g., known oligonucleotide analogues allows for
optimization of duplex stability and mismatch discrimination. One useful
aspect of altering the Tm arises from the fact that adenine-thymine (A-T)
duplexes have a lower Tm than guanine-cytosine (G-C) duplexes, due in part
to the fact that the A-T duplexes have 2 hydrogen bonds per base-pair,
while the G-C duplexes have 3 hydrogen bonds per base pair. In
heterogeneous oligonucleotide arrays in which there is a non-uniform
distribution of bases, it is not generally possible to optimize
hybridization for each oligonucleotide probe simultaneously. Thus, in some
embodiments, it is desirable to selectively destabilize G-C
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lammomim 



duplexes and/or to increase the stability of A-T duplexes. This can be
accomplished, e.g., by substituting guanine residues in the probes of an
array which form G-C duplexes with hypoxanthine, or by substituting adenine
residues in probes which form A-T duplexes with 2,6 diaminopurine or by
using the salt tetramethyl ammonium chloride (TMACl or other alkylated
ammonium salts) in place of NaCl.

Altered duplex stability conferred by using oligonucleotide analogue probes
can be ascertained by following, e.g., fluorescence signal intensity of
oligonucleotide analogue arrays hybridized with a target oligonucleotide
over time. The data allow optimization of specific hybridization conditions
at, e.g., room temperature (for simplified diagnostic applications in the
future).

Another way of verifying altered duplex stability is by following the
signal intensity generated upon hybridization with time. Previous
experiments using DNA targets and DNA chips have shown that signal
intensity increases with time, and that the more stable duplexes generate
higher signal intensities faster than less stable duplexes. The signals
reach a plateau or "saturate" after a certain amount of time due to all of
the binding sites becoming occupied. These data allow for optimization of
hybridization, and determination of the best conditions at a specified
temperature.

Methods of optimizing hybridization conditions are well known to those of
skill in the art (see, e.g., Laboratory Techniques in Biochemistry and
Molecular Biology, Vol. 24: Hybridization With Nucleic Acid Probes, P.
Tijssen, ed. Elsevier, N.Y., (1993)).

VII. Detection Methods

Methods for detection depend upon the label selected and are known to those
of skill in the art. Thus, for example, where a colorimetric label is used,
simple visualization of the label is sufficient. Where a radioactive
labeled probe is used, detection of the radiation (e.g., with photographic
film or a solid state detector) is sufficient.

As explained above, the use of a fluorescent label is preferred because of
its extreme sensitivity and simplicity. Standard procedures are used to
determine the positions where interactions between a target sequence and a
reagent take place. For example, if a target sequence is labeled and
exposed to an array of different oligonucleotide probes, only those
locations where the oligonucleotides interact with the target (sample
nucleic acid(s)) will exhibit significant signal. In addition to using a
label, other methods may be used to scan the matrix to determine where
interaction takes place. The spectrum of interactions can, of course, be
determined in a temporal manner by repeated scans of interactions which
occur at each of a multiplicity of conditions. However, instead of testing
each individual interaction separately, a multiplicity of sequence
interactions may be simultaneously determined on a matrix.
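
Because each array position holds a probe of known sequence, the positions that exhibit significant signal can be translated directly into a list of interacting sequences. The following minimal sketch illustrates that lookup. The layout, the intensity values, and the cutoff of three times background (`min_ratio`) are hypothetical and are not taken from this disclosure.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]


def interacting_probes(
    layout: Dict[Position, str],
    intensities: Dict[Position, float],
    background: float,
    min_ratio: float = 3.0,  # assumed cutoff, for illustration only
) -> List[str]:
    """Return the probe sequences whose signal exceeds the cutoff."""
    threshold = min_ratio * background
    hits = [
        sequence
        for position, sequence in layout.items()
        if intensities.get(position, 0.0) > threshold
    ]
    return sorted(hits)


# Hypothetical 2 x 2 array: position -> probe sequence.
layout = {(0, 0): "ACGT", (0, 1): "ACGA", (1, 0): "TTGC", (1, 1): "GGCA"}
# Hypothetical scan: position -> fluorescence intensity.
intensities = {(0, 0): 950.0, (0, 1): 120.0, (1, 0): 40.0, (1, 1): 610.0}

print(interacting_probes(layout, intensities, background=50.0))
# ['ACGT', 'GGCA']
```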



B. Scanning System

In a preferred embodiment, the hybridized array is excited with a light
source at the excitation wavelength of the particular fluorescent label and
the resulting fluorescence at the emission wavelength is detected. In a
particularly preferred embodiment, the excitation light source is a laser
appropriate for the excitation of the fluorescent label.

Detection of the fluorescence signal preferably utilizes a confocal
microscope, more preferably a confocal microscope automated with a
computer-controlled stage to automatically scan the entire high density
array. The microscope may be equipped with a phototransducer (e.g., a
photomultiplier, a solid state array, a ccd camera, etc.) attached to an
automated data acquisition system to automatically record the fluorescence
signal produced by hybridization to each oligonucleotide probe on the
array. Such automated systems are described at length in U.S. Patent No:
5,143,854, PCT Application WO 92/10092, and copending U.S.S.N. 08/195,889
filed on February 10, 1994. Use of laser illumination in conjunction with
automated confocal microscopy for signal detection permits detection at a
resolution of better than about 100 µm, more preferably better than about
50 µm, and most preferably better than about 25 µm.

With the automated detection apparatus, the correlation of specific
positional labeling is converted to the presence on the target of sequences
for which the oligonucleotides have specificity of interaction. Thus, the
positional information is directly converted to a database indicating what
sequence interactions have occurred. For example, in a nucleic acid
hybridization application, the sequences which have interacted between the
substrate matrix and the target molecule can be directly listed from the
positional information. A preferred detection system is described in PCT
publication no.
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i scaimng 



WO 90/15070; and U.S.S.N. 07/624,120. Although the detection described
therein is a fluorescence detector, the detector can be replaced by a
spectroscopic or other detector. The scanning system can make use of a
moving detector relative to a fixed substrate, a fixed detector with a
moving substrate, or a combination. Alternatively, mirrors or other
apparatus can be used to transfer the signal directly to the detector. See,
e.g., U.S.S.N. 07/624,120.

The detection method will typically also incorporate some signal processing
to determine whether the signal at a particular matrix position is a true
positive or may be a spurious signal. For example, a signal from a region
which has actual positive signal may tend to spread over and provide a
positive signal in an adjacent region which actually should not have one.
This may occur, e.g., where the scanning system is not properly
discriminating with sufficiently high resolution in its pixel density to
separate the two regions. Thus, the signal over the spatial region may be
evaluated pixel by pixel to determine the locations and the actual extent
of positive signal. A true positive signal should, in theory, show a
uniform signal at each pixel location. Thus, processing by plotting number
of pixels with actual signal intensity should have a clearly uniform signal
intensity. Regions where the signal intensities show a fairly wide
dispersion, may be particularly suspect and the scanning system may be
programmed to more carefully scan those positions.

More sophisticated signal processing techniques can be applied to the
initial determination of whether a positive signal exists or not. See,
e.g., U.S.S.N. 07/624,120 and discussion below in Section XII.

VIII. Ligation-Enhanced Signal Detection.
A) General Ligation Reaction.

Ligation reactions can be used to discriminate between fully complementary
hybrids and those that differ by one or more base pairs, particularly in
cases where the mismatch is near the 5' terminus of the probe
oligonucleotide. Use of a ligation reaction in signal detection increases
the stability of the hybrid duplex, improves hybridization specificity
(particularly for shorter probe oligonucleotides, e.g., 5 to 12 mers), and
optionally, provides additional sequence information.

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Various components for use of ligation 
reaction(s) in combination with generic difference arrays are illustrated
in Figure 13a. In a simplest embodiment, the probe oligonucleotide/ligation
reaction system includes an array of oligonucleotide probes. As discussed
above, the oligonucleotide probes can be randomly selected, haphazardly
selected, composition biased, inclusive of all possible oligonucleotides of
a particular length, and so forth. The oligonucleotide probes can
optionally include a predetermined "constant" region (see Fig. 13a) which
has substantially the same sequence for substantially all of the probe
oligonucleotides on the array.

Where the probe comprises a constant region it also preferably comprises a
"variable region" (see Fig. 13a) which can be randomly selected,
haphazardly selected, composition biased, inclusive of all possible
oligonucleotides of a particular length, and so forth. When constant and
variable regions are present, a sample nucleic acid that hybridizes to the
oligonucleotide probe typically hybridizes to at least the variable region
and optionally to the constant region as well.

The probe oligonucleotide/ligation reaction system also optionally includes
a nucleic acid that is complementary to the constant region. This
complement may be a subsequence of a sample nucleic acid or a separate
oligonucleotide. When the complement to the constant region is a separate
oligonucleotide, hybridization to the constant region provides a ligation
site (see Fig. 13a, ligation site A). The hybridized complement to the
constant region can optionally be permanently crosslinked to the constant
region by the use of crosslinking reagents (e.g., psoralens). The sample
nucleic acid, and/or the ligatable oligonucleotide can optionally be
labeled. Where both are labeled, the labels can be the same or
distinguishable.

The probe oligonucleotide/ligation reaction system optionally includes a
ligatable oligonucleotide that can be ligated to the free terminus of the
variable region (see Fig. 13a, ligation site B). The ligatable
oligonucleotide can be a single oligonucleotide of known nucleotide
sequence, a collection of nucleic acids of known sequence, or a pool of all
possible oligonucleotides of a particular length.

These various components of the probe oligonucleotide/ligation reaction
system can be combined in a variety of ways to increase the stability of
the hybrid duplex, and/or improve hybridization specificity (particularly
for shorter probe oligonucleotides, e.g., 5 to 12 mers), and/or provide
sequence information. Various uses of the probe oligonucleotide/ligation
reaction system are described in detail below.

While Figure 13a illustrates ligation in solid phase, similar approaches
and components can be used in solution phase. It will be appreciated that
the order of the constant region and variable region can be reversed. In
addition, a probe oligonucleotide may comprise multiple constant regions
and/or multiple variable regions. In addition, while Fig. 13a illustrates
the probe oligonucleotide attached to a solid support by a 3' terminus, the
probe can also be reversed and attached via the 5' terminus.

It will be appreciated that sequences or subsequences of the probe
oligonucleotide where variable regions are present or [illegible] can act
as a primer site for initiation of polymerization using the remainder of
the probe oligonucleotide and/or the ligation oligonucleotide and/or the
sample nucleic acid as a polymerization template.

B) Ligation Reactions to Discriminate Mismatches at Probe Termini, Target
Termini or Both Termini

In one embodiment, a simple ligation reaction discriminates mismatches at
or near the terminus of the probe oligonucleotide (see Fig. 13b).
Typically, the nucleic acid fragments comprising the sample nucleic acid
are longer than the probe oligonucleotides in the array, so that, when
hybridized, the target nucleic acid typically has an overhang. When the
array comprises probe oligonucleotides attached through their 3' termini,
the hybridized target (sample) nucleic acid provides a 3' overhang. In this
embodiment, the target nucleic acid is not necessarily labeled (see, e.g.,
Fig. 13b).

When the array of oligonucleotides is combined with the target nucleic acid
to form target-oligonucleotide hybrid complexes, the target-oligonucleotide
hybrid complexes are contacted with a ligase and a labelled, ligatable
oligonucleotide or, alternatively, with a pool of labelled, ligatable
probes. While the hybridization of the sample nucleic acids and the
ligatable probes can be performed sequentially, in a preferred embodiment
both hybridization and ligation are performed simultaneously (i.e., the
target, ligatable oligonucleotide, and ligase are all added together). The
pool may comprise particular preselected probes or may be a collection of
all possible probes of a particular length (e.g., 3 mer up to 12 mer) (see,
e.g., Fig. 13b).
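
As a worked illustration of what "a collection of all possible probes of a particular length" implies, a pool of every n-mer over the four bases contains 4^n members. The sketch below enumerates a small pool and tabulates pool sizes across the 3 mer to 12 mer range mentioned above. The function name is hypothetical and the lengths shown are chosen only as examples.

```python
from itertools import product
from typing import List


def all_oligos(length: int, alphabet: str = "ACGT") -> List[str]:
    """Enumerate every oligonucleotide of the given length."""
    return ["".join(bases) for bases in product(alphabet, repeat=length)]


pool = all_oligos(3)
print(len(pool), pool[:4])  # 64 ['AAA', 'AAC', 'AAG', 'AAT']

# Pool size grows as 4**n; larger pools are counted rather than enumerated.
sizes = {n: 4 ** n for n in (3, 6, 8, 12)}
print(sizes)  # {3: 64, 6: 4096, 8: 65536, 12: 16777216}
```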



The ligation reaction of the labelled, ligatable probes to the
phosphorylated 5' end of the oligonucleotide probes on the substrate will
occur, in the presence of the ligase, predominantly when the
target:oligonucleotide hybrid has formed with correct base-pairing near the
5' end of the oligonucleotide probe and where there is a suitable 3'
overhang of the target nucleic acid to serve as a template for
hybridization and ligation (see Fig. 12). After the ligation reaction, the
substrate is washed (multiple times if necessary) under conditions suitable
to remove the target nucleic acid and the labeled, unligated probes (e.g.,
above 40°C to 50°C, or under otherwise highly stringent conditions).

Thereafter, a fluorescence image (e.g., a fluorescent image) of the
hybridization pattern is obtained as described above in Section VII(B).
Labeled oligonucleotide probes, i.e., the oligonucleotide probes which are
complementary to the target nucleic acid, are identified. The presence,
absence, and/or intensity of the hybridization signal provides information
regarding the presence and level of the nucleic acid sequence or
subsequence in the nucleic acid sample as described above.

Any enzyme that catalyzes the formation of a phosphodiester bond at the
site of a single-stranded break in duplex DNA can be used to enhance
discrimination between fully complementary hybrids and those that differ by
one or more base pairs. Such ligases include, but are not limited to, T4
DNA ligase, ligases isolated from E. coli and ligases isolated from other
bacteria and bacteriophages. The concentration of the ligase will vary
depending on the particular ligase used, the concentration of target and
buffer conditions, but will typically range from about 50 units/ml to about
5,000 units/ml. Moreover, the time in which the array of
target:oligonucleotide hybridization complexes is in contact with the
ligase will vary. Typically, the ligase treatment is carried out for a
period of time ranging from minutes to hours. Methods of performing ligase
discrimination can be found in copending USSN 08/533,582, filed on October
18, 1995 and in Jackson et al. (1996) Nature Biotechnology, 14: 1685-1691.

It will be appreciated that the method described above primarily
discriminates mismatches at or near the 5' terminus of the surface bound
probe oligonucleotide and does little to discriminate mismatches at, or
near, the 5' terminus of the target (sample) nucleic acid (see Fig. 13b).

In another embodiment, a ligation can be used to discriminate mismatches
at, or near, the end of the sample nucleic acid (Fig. 13c). In this
instance, the probe oligonucleotides comprise a constant region and a
variable region (e.g., the variable regions can include all possible 8 mers
as illustrated in Fig. 13c). A constant oligonucleotide (complementary to
the constant region or a subsequence thereof) is hybridized to the constant
region and cross-linked (e.g., covalently bound) at that location. The
remainder of the probe oligonucleotide (e.g., the variable region or
subsequences thereof and optionally a subsequence of the constant region)
forms a 5' overhang to which the nucleic acid sample can hybridize. Where
there are no mismatches at or near the terminus of the sample
oligonucleotide, a ligation event joins the sample oligonucleotide to the
constant oligonucleotide. Free nucleic acids are washed away leaving bound
hybridized sample oligonucleotides which can then be detected.

In still another embodiment, a double ligation (illustrated in Fig. 13d)
can be used to discriminate mismatches at or near the ends of both the
probe oligonucleotide and the target nucleic acid. In this approach, the
probe oligonucleotides each comprise a constant region and a variable
region as described above in VIII(A). The surface bound oligonucleotide
probes are hybridized to a constant oligonucleotide having a sequence which
is complementary to the constant region of the oligonucleotide probes. The
sample (target) nucleic acids are contacted to the hybrid duplex in the
presence of a ligase. Where there is no terminal mismatch between the
sample nucleic acid and the variable region, the ligation is successful
resulting in the ligation of the constant oligonucleotide to the sample
nucleic acid (see "first ligation" in Fig. 13d). This ligation thus
discriminates mismatches at the terminus of the sample nucleic acid.

The hybridized duplex is contacted with a pool of labeled ligatable
oligonucleotides. Where a ligatable probe is complementary to the overhang
produced by the hybridized sample nucleic acid and there are no mismatches
at or near the free terminus of the variable region of the probe
oligonucleotide a second ligation will attach the labeled ligatable probe
(see Fig. 13d). The second ligation thus discriminates against mismatches
near the free terminus of the probe oligonucleotide. It will be appreciated
that the various hybridization and ligation reactions may be carried out
sequentially or simultaneously, and in a preferred embodiment are carried
out simultaneously.



As with the previously described method, any enzyme that catalyzes the
formation of a phosphodiester bond at the site of a single-strand break in
duplex DNA can be used to enhance discrimination between fully
complementary hybrids and those that differ by one or more base pairs. Such
ligases include, but are not limited to, T4 DNA ligase, ligases isolated
from E. coli and ligases isolated from other bacteria or bacteriophages.
The concentration of the ligase will vary depending on the particular
ligase used, the concentration of target and buffer conditions, but will
typically range from about 50 units/ml to about 5,000 units/ml. Moreover,
the time in which the array of target oligonucleotide:oligonucleotide probe
hybrid complexes is in contact with the ligase will vary. Typically, the
ligase treatment is carried out for a period of time ranging from minutes
to hours. In addition, it will be readily apparent to those of skill that
the two ligation reactions can either be done sequentially or,
alternatively, simultaneously in a single reaction mix that contains:
target oligonucleotides; constant oligonucleotides; a pool of labeled,
ligatable probes; and a ligase.

In this dual ligation method, the first ligation reaction generally occurs
only if the 5' end of the target oligonucleotide (i.e., the last 3-4 bases)
matches the variable region of the oligonucleotide probe. Similarly, the
second ligation reaction, which adds a label to the probe, generally occurs
efficiently only if the first ligation reaction was successful and if the
ligated target is complementary to the 5' end of the probe. Thus, this
method provides for specificity at both ends of the variable region.
Moreover, this method is advantageous in that it allows a shorter variable
probe region to be used; increases probe:target specificity and removes the
necessity of labeling the target. Dual ligation methods of this sort are
described in detail in copending USSN 08/533,582, filed on October 18,
1995.

In another embodiment, after hybridization of the oligonucleotide
complementary to the constant region of the probe oligonucleotides, the
hybrid duplex formed thereby can be permanently cross-linked so as to
prevent subsequent denaturation of the hybrid duplex. When the sample
nucleic acid is ligated to the overhang thus formed it is also permanently
attached to the solid support. In this embodiment, the use of a ligatable
oligonucleotide is optional. The sample nucleic acid may itself be labeled
thereby permitting detection of the ligated sample nucleic acids.




Methods for cross-linking nucleic acids are well known to those of skill in the art. Such methods include, but are not limited to, exposure to UV, exposure to ionizing radiation, and contact with chemical cross-linking reagents. In a particularly preferred embodiment, cross-linking is accomplished by the formation of covalent bonds with chemical cross-linking reagents. Preferred cross-linking reagents include bifunctional cross-linking reagents and cross-linking is accomplished by chemical or photoactivation of the cross-linking reagent with the nucleic acids. The reagents may be applied after formation of hybrid duplexes, but in a preferred embodiment, the cross-linker is initially attached to either the probe or complementary (to the constant region) nucleic acids before hybridization.

The cross-linking reagent can be any molecule which covalently cross-links the tester nucleic acid to a hybridized driver nucleic acid. Generally the cross-linking agent will be a bifunctional photoreagent which will be monoadducted to the tester or driver nucleic acids leaving a second photochemically reactive residue which can bind covalently to the corresponding hybridized nucleic acid upon photoexcitation. The cross-linking molecule may also be a mixed chemical and photochemical bifunctional reagent which will be non-photochemically bound to the probe or tester nucleic acids via a chemical reaction such as alkylation, condensation, or addition, followed by photochemical binding to the corresponding hybridized nucleic acid. Bifunctional chemical cross-linking molecules activated either catalytically or by high temperature following hybridization may also be employed.

Examples of bifunctional photoreagents include furocoumarins, benzodipyrones, and bis azides such as bis-azido ethidium bromide. Examples of mixed bifunctional reagents with both chemical and photochemical binding moieties include haloalkyl-furocoumarins, haloalkyl benzodipyrones, haloalkyl-coumarins and various azido nucleoside triphosphates.

Particularly preferred cross-linkers include linear furocoumarins (psoralens) such as 8-methoxypsoralen, 5-methoxypsoralen and 4,5',8-trimethylpsoralen, and the like. Other suitable cross-linkers include cis-benzodipyrone and trans-benzodipyrone. The cross-linker known commercially as Sorlon is also suitable. For a detailed description of the cross-linking of hybridized nucleic acids see WO 85/02628.



The foregoing discrimination enhancement methods involving the use of ligation reactions can be used in all instances where improved discrimination between fully complementary hybrids and those that differ by one or more base pairs would be helpful. More particularly, such methods can be used to more accurately determine the sequence (e.g., de novo sequencing), monitor expression, monitor mutations, or resequence the target nucleic acid (i.e., such methods can be used in conjunction with a second sequencing procedure to provide independent verification). The foregoing is intended to illustrate, and not restrict, the way in which an array of target:oligonucleotide probe hybrid complexes can be treated with a ligase and a pool of labeled, ligatable probes to improve hybridization signals on high density oligonucleotide arrays.



B) Ligation Reaction to Add Sequence Information.

i) Extending sequence information.

The ligation reactions described above can also be used to increase the sequence information obtained regarding the hybridized nucleic acid. It will be appreciated that the nucleotide sequence of each probe oligonucleotide on the high density oligonucleotide array is known. Specific hybridization to a sample nucleic acid indicates that the hybridized sample nucleic acid has a sequence or subsequence complementary to the hybridized probe oligonucleotide. Thus a hybridization event provides sequence information that can be used to identify the nucleic acids (e.g., gene transcripts) present in the hybridized sample.

Generally speaking, the sequence information obtained is governed by the length of the probe oligonucleotide. Thus, where the probe oligonucleotide is an 8 mer, 8 nucleotides of sequence information is obtained.

However, the ligation discrimination reactions described above can be used to provide additional sequence information. In this embodiment, rather than every possible ligatable oligonucleotide of a given length, the array and sample nucleic acids are hybridized to predetermined ligatable oligonucleotides in which the nucleotides at one or more positions are known. Successful hybridization and ligation of the labeled oligonucleotide thus indicates that the hybridized sample nucleic acid has nucleotides complementary to the ligatable oligonucleotide in addition to the probe oligonucleotide.
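The way a ligated oligonucleotide extends the sequence read from a probe can be illustrated with a short sketch. This is a hypothetical illustration only: the function names, the example sequences, and the assumed geometry (the ligatable oligonucleotide joined to the 3' end of the probe, with all sequences written 5' to 3') are assumptions made for the example and are not specified by the text.

```python
# Illustrative sketch only: names, sequences and strand orientation are assumptions.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def inferred_sample_subsequence(probe: str, ligated_oligo: str) -> str:
    """Sample subsequence implied by hybridization plus ligation.

    Assumes the ligatable oligonucleotide is joined to the 3' end of the
    probe, so the sample must be complementary to probe + ligated_oligo.
    """
    joined = probe + ligated_oligo
    return reverse_complement(joined)


probe_8mer = "ACGTACGT"    # hypothetical array probe, known from its location
ligatable_6mer = "TTGACC"  # hypothetical labeled ligatable oligonucleotide

subsequence = inferred_sample_subsequence(probe_8mer, ligatable_6mer)
print(len(subsequence), subsequence)  # 14 GGTCAAACGTACGT
```

Under these assumptions an 8 mer probe and a 6 mer ligatable oligonucleotide together define 14 nucleotides of the hybridized sample nucleic acid.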






Thus, for example, where the probe oligonucleotide is an 8 mer and specific 6 mer ligatable probes are used, the resulting hybridization will provide 14 nucleotides worth of sequence information.

Where different ligatable oligonucleotides are used in this context, it is desirable to distinguish between the various ligated oligonucleotides. This can be accomplished by sequential ligations with each different species of ligatable probe followed by reading of the array. Alternatively, each species of ligatable oligonucleotide can be labeled with a different detection label allowing simultaneous ligation and subsequent detection of the various different labels.

ii) Use of a generic ligation GeneChip for interrogating sequences adjacent to restriction sites in a complex (target) sample nucleic acid.

The generic difference arrays can be used to fingerprint complex DNA clones or to monitor the complex pattern of gene expression from a given source. In fingerprinting a nucleic acid sequence, an 8 bp sequence adjacent to a given restriction enzyme site is sequenced.

In fingerprinting, a restriction enzyme is used which cleaves the target at a frequency dependent on the length of the recognition sequence. The restriction digest thus generates nucleic acid fragments approximately uniformly distributed along the genomic DNA. For instance, a 4-cutter like Hsp92 II would cut a target about once every several hundred basepairs, whereas a 6-cutter like SacI would cut a target about once every several thousand (4,000) basepairs. With restriction enzyme fragments, the individual fragments are typically non-overlapping and average several thousand basepairs in length. For the purposes of fingerprinting, with a 6-cutter restriction enzyme it is possible to examine (2000-3000 fragments X 4000 bases/fragment =) 8-12 million basepairs per target. This indicates that it is possible to routinely sort an 8-12 million basepair target in a high density array to measure expression differences or to monitor gene expression (see, e.g., Fig. 14c), thereby providing a characteristic expression "fingerprint" or abundance difference fingerprint for each restriction digest of the sample nucleic acid. The fingerprinting methods thus provide means to subsample a nucleic acid population in a roughly uniform and reproducible manner and determine abundance differences for the target nucleic acid thus subsampled.
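The cutting frequencies and the sampled target size quoted above follow from simple arithmetic, sketched below. The sketch assumes a random sequence with equal base frequencies, in which case an n-base recognition site is expected about once every 4^n bases; the function names are illustrative and not part of the method.

```python
# Illustrative arithmetic only; assumes equal base frequencies in the target.

def expected_cut_spacing(site_length: int) -> int:
    """Expected number of basepairs between occurrences of an n-base site."""
    return 4 ** site_length


def sampled_basepairs(n_fragments: int, bases_per_fragment: int) -> int:
    """Total basepairs represented by a set of restriction fragments."""
    return n_fragments * bases_per_fragment


print(expected_cut_spacing(4))  # 256  (a 4-cutter: several hundred basepairs)
print(expected_cut_spacing(6))  # 4096 (a 6-cutter: roughly 4,000 basepairs)

low = sampled_basepairs(2000, 4000)
high = sampled_basepairs(3000, 4000)
print(low, high)  # 8000000 12000000
```

The two products reproduce the 8-12 million basepair range stated for a 6-cutter digest sorted on the array.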

In general, the method involves providing a high density generic difference screening array where the probe oligonucleotides comprise a constant region and a variable region as described above. In this instance, however, the last few bases of the constant region (anchor sequence) are selected to complement the 3' end of the restriction recognition site (see, e.g., Figs. 14a and 14b) and [...] shortened by the appropriate number of bases. The variable region can be randomly selected, haphazardly selected, or composition biased as described above. However, in a preferred embodiment, the variable region includes all possible nucleic acids of a particular length (e.g., all possible 3 mers, all possible 4 mers . . . all possible 12 mers), more preferably all possible 8 mers.

The sample nucleic acids are prepared by fragmentation using a restriction enzyme. Preferred restriction enzymes leaving only 0, 1, or 2 bases at the 5' end provide a greater specificity of ligation (e.g., SacI leaves just a 5' C and Hsp92 II leaves no recognition site bases at the 5' end). However, restriction enzymes leaving more bases at the 5' end can be used. Several restriction enzymes can be used simultaneously if they all leave the same recognition base at the 5' end. For instance, AatII, SacI, SphI, HhaI, Bsp1286I, ApaI, KpnI, and BanII all leave just a C at the 5' end, making these compatible enzymes. Restriction enzymes and their characteristic recognition/cleavage sites are well known to those of skill in the art (see, e.g., Clontech catalogue, Clontech Laboratories Inc., Palo Alto, CA).

The digested target is then hybridized and ligated to the high density array, preferably in the presence of a complement to the constant region, using standard conditions (e.g., 30°C o/n, 800 U T4 ligase, T4 ligase buffer). The hybridization in effect sorts (locates and/or localizes) the sample nucleic acids, the position of the sample nucleic acids being determined by the sequence of the bases adjacent to the restriction site at the 5' end. The hybridization data can be used directly in an expression monitoring method as described above, or the same procedure can be performed with two or more sample nucleic acids for generic difference expression profiles and/or [...]
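The compatibility rule for using several restriction enzymes together (all must leave the same recognition base at the 5' end) can be checked mechanically. The sketch below is illustrative: the helper names are hypothetical, the '^' notation marks the top-strand cut position, and only enzymes with non-degenerate recognition sequences are listed, so Bsp1286I and BanII are omitted.

```python
# Illustrative sketch; helper names are hypothetical. '^' marks the top-strand cut.
from collections import defaultdict

RECOGNITION_SITES = {
    "AatII": "GACGT^C",
    "SacI": "GAGCT^C",
    "SphI": "GCATG^C",
    "HhaI": "GCG^C",
    "ApaI": "GGGCC^C",
    "KpnI": "GGTAC^C",
    "Hsp92II": "CATG^",
}


def retained_five_prime_bases(site: str) -> str:
    """Recognition-site bases left at the 5' end of the downstream fragment."""
    _, _, downstream = site.partition("^")
    return downstream


def compatible_groups(sites: dict) -> dict:
    """Group enzymes by the recognition bases they leave at the 5' end."""
    groups = defaultdict(list)
    for enzyme, site in sites.items():
        groups[retained_five_prime_bases(site)].append(enzyme)
    return dict(groups)


groups = compatible_groups(RECOGNITION_SITES)
print(groups["C"])  # enzymes leaving just a C at the 5' end
print(groups[""])   # enzymes leaving no recognition-site bases
```

Enzymes that fall in the same group could, under this simplified model, be used simultaneously with a single anchor sequence.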



In a preferred embodiment, one of two formats is used. In Format I, the ligated fragment (e.g., the sample nucleic acid and, optionally, the complement to the constant region) is locked into place in the high density array by its attachment (e.g., by cross-linking) to the complement (e.g., by the use of a psoralen). The complementary strand to the fragment can be denatured and washed off of the array with a dilute base (e.g., 1 N NaOH). These cross-linked fragments can then be used as probes in a second round of hybridization to one or more nucleic acid samples. Differential nucleic acid abundances (e.g., differential gene expression) can then be monitored by comparing the hybridization pattern between different nucleic acids hybridized simultaneously or sequentially to the same array or separate arrays.

In a second format (Format II), particularly where the sample nucleic acid is a deoxynucleic acid sample, the DNA is restriction digested as described above, and then directly hybridized/ligated to the generic difference array. At sites where intensity differences are found, the differentially abundant (e.g., differentially expressed) nucleic acid can be cloned by designing primers specific to that nucleic acid based upon the sequence information derived from the location of the probe in the array and the sequence of the recognition site. For an 8 mer variable region and a 6 base restriction enzyme, this gives a 14 mer primer sequence. For short genomes, a 14 mer primer may be used to isolate the clone. Longer genomes become more tractable as the length of the primary probes (variable region) increases beyond 8 mers.

The restriction enzyme digested sample nucleic acid is preferably labeled and ligated to the high density array in fingerprinting methods and in Format II (see discussion above and Fig. 14d). In the case of Format I, the ligated target sequence is preferably not labeled and instead serves as a hybridization probe in a second round of hybridization of labeled sample nucleic acids to the high density array.

To ensure that sites which have not been cut by the given restriction enzymes do not ligate to the high density array, alkaline phosphatase can be used to treat the sample nucleic acids before restriction enzyme digestion.

iii) Analysis of differential display fragments on a generic difference array.



